SI 614: Community structure in networks
![Page 1: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/1.jpg)
School of Information, University of Michigan
SI 614: Community structure in networks
Lecture 17
![Page 2: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/2.jpg)
Outline
One-mode networks and cohesive subgroups
  measures of cohesion
  types of subgroups
Affiliation networks
Team assembly
![Page 3: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/3.jpg)
Why care about group cohesion?
Opinion formation and uniformity
  if each node adopts the opinion of the majority of its neighbors, different cohesive subgroups can end up holding different opinions
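The majority rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy graph (two triangles joined by one bridge edge), the synchronous update, and the tie-breaking rule (keep the current opinion) are not from the lecture.

```python
def majority_step(adj, opinion):
    """One synchronous update: every node adopts the majority opinion of its neighbors."""
    new = {}
    for node, nbrs in adj.items():
        ones = sum(opinion[n] for n in nbrs)
        zeros = len(nbrs) - ones
        if ones > zeros:
            new[node] = 1
        elif zeros > ones:
            new[node] = 0
        else:
            new[node] = opinion[node]  # assumed tie rule: keep the current opinion
    return new


def run_to_fixed_point(adj, opinion, max_steps=100):
    """Repeat the update until nothing changes (or max_steps is reached)."""
    for _ in range(max_steps):
        nxt = majority_step(adj, opinion)
        if nxt == opinion:
            break
        opinion = nxt
    return opinion


# Hypothetical toy network: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
start = {0: 1, 1: 1, 2: 1, 3: 0, 4: 0, 5: 0}
final = run_to_fixed_point(adj, start)
```

In this toy case each triangle keeps its own opinion, because every node has more neighbors inside its triangle than across the bridge.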
![Page 4: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/4.jpg)
within a cohesive subgroup – greater uniformity
![Page 5: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/5.jpg)
Other reasons to care
Discover communities of practice (more on this next time)
Measure isolation of groups
Threshold processes:
  I will adopt an innovation if some number of my contacts do
  I will vote for a measure if a fraction of my contacts do
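A minimal sketch of both threshold rules (absolute count and fraction of contacts). The function name `spread`, the `fractional` flag, and the toy graph are assumptions made for illustration.

```python
def spread(adj, seeds, threshold, fractional=False):
    """Return the final set of adopters given initial seeds.

    A node adopts once the number (or, if fractional=True, the fraction)
    of its neighbors that have adopted reaches the threshold.
    """
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in adj.items():
            if node in adopted or not nbrs:
                continue
            count = sum(1 for n in nbrs if n in adopted)
            level = count / len(nbrs) if fractional else count
            if level >= threshold:
                adopted.add(node)
                changed = True
    return adopted


# Hypothetical toy network: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
stuck = spread(adj, {0, 1}, threshold=2)       # stops at the bridge
everyone = spread(adj, {0, 1}, threshold=1)    # crosses the bridge
```

With a threshold of 2 contacts the innovation stays inside the first triangle, since node 3 only ever sees one adopting neighbor.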
![Page 6: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/6.jpg)
What properties indicate cohesion?
mutuality of ties
  everybody in the group knows everybody else
closeness or reachability of subgroup members
  individuals are separated by at most n hops
frequency of ties among members
  everybody in the group has links to at least k others in the group
relative frequency of ties among subgroup members compared to nonmembers
![Page 7: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/7.jpg)
Cliques
Every member of the group has links to every other member
Cliques can overlap
Figure: overlapping cliques of size 3; clique of size 4
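Outside Pajek, maximal cliques can be enumerated with the Bron-Kerbosch algorithm. The sketch below uses the basic version without pivoting; the example edges (two triangles sharing a node, plus a separate clique of size 4) are made up to mirror the figure labels.

```python
def build_adj(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Bron-Kerbosch without pivoting; returns sorted maximal cliques."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already-explored nodes
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Hypothetical example: triangles {0,1,2} and {2,3,4} overlap at node 2;
# nodes 5-8 form a clique of size 4.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4),
         (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (7, 8)]
cliques = maximal_cliques(build_adj(edges))
```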
![Page 8: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/8.jpg)
Considerations in using cliques as subgroups
Not robust
  one missing link can disqualify a clique
Not interesting
  everybody is connected to everybody else
  no core-periphery structure
  no centrality measures apply
How cliques overlap can be more interesting than the fact that they exist
Pajek (remember from the class on motifs):
  construct a network that is a clique of the desired size
  Nets>Fragment (1 in 2)>Find
![Page 9: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/9.jpg)
A less stingy definition of cohesive subgroups: k-cores
Each node within a group is connected to at least k other nodes in the group
Figure: 3-core and 4-core
Pajek: Net>Partitions>Core>Input, Output, All
  assigns each vertex to the highest-order k-core it belongs to
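The assignment of each vertex to its highest-order core can be sketched with the usual peeling procedure: repeatedly remove a node of minimum remaining degree. This is a simple quadratic-time illustration, not Pajek's implementation, and the example graph is hypothetical.

```python
def core_numbers(adj):
    """Return {node: k}, where k is the highest-order core containing the node."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)  # peel a lowest-degree node
        k = max(k, degree[v])               # core order never decreases while peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


# Hypothetical example: a clique on {0,1,2,3} with a tail 0-4-5 hanging off it.
adj = {
    0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2},
    4: {0, 5}, 5: {4},
}
cores = core_numbers(adj)
```

The clique members land in the 3-core, while the tail nodes only reach the 1-core.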
![Page 10: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/10.jpg)
Subgroups based on reachability and diameter
n-cliques
  maximal distance between any two nodes in the subgroup is n
  Figure: 2-cliques
theoretical justification
  information flow through intermediaries
![Page 11: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/11.jpg)
Frequency of in-group ties
Compare the number of in-group ties with the number of ties from the group to outside nodes
Given the number of edges incident on nodes in the group, what is the probability that the observed fraction of them fall within the group?
The smaller the probability, the stronger the cohesion
Figure labels: within-group ties; ties from group to nodes external to the group
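The slide does not say which null model the probability refers to. One possible choice, assumed here purely for illustration, is to place the group's incident edges uniformly at random among all node pairs that touch the group; the number of within-group ties is then hypergeometric, and the tail probability can be computed exactly.

```python
from math import comb


def cohesion_p_value(n_group, n_total, ties_in, ties_out):
    """P(at least ties_in of the group's ties fall inside the group).

    Assumed null model: the ties_in + ties_out edges incident on the group
    are placed uniformly at random among all node pairs touching the group.
    """
    inside_slots = comb(n_group, 2)                # pairs with both ends in the group
    outside_slots = n_group * (n_total - n_group)  # pairs with exactly one end in the group
    m = ties_in + ties_out
    total = comb(inside_slots + outside_slots, m)
    upper = min(m, inside_slots)
    tail = sum(
        comb(inside_slots, k) * comb(outside_slots, m - k)
        for k in range(ties_in, upper + 1)
    )
    return tail / total


# Hypothetical numbers: a group of 4 nodes in a network of 20 nodes.
tight = cohesion_p_value(4, 20, ties_in=6, ties_out=2)   # fully connected inside
loose = cohesion_p_value(4, 20, ties_in=1, ties_out=7)   # mostly external ties
```

Under this assumption the fully connected group gets a far smaller probability than the loosely tied one, which matches the reading "the smaller the probability, the stronger the cohesion".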
![Page 12: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/12.jpg)
Considerations with n-cliques
problem
  diameter may be greater than n
  n-clique may be disconnected (paths go through nodes not in subgroup)
  Figure: a 2-clique with diameter = 3; the path runs outside the 2-clique
fix
  n-club: maximal subgraph of diameter n (n = 2 in this example)
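The difference between the two definitions comes down to which paths are allowed when measuring distance. The sketch below compares distances in the whole network with distances inside the subgroup only. The six-node graph is a standard textbook-style example chosen for illustration, not necessarily the one drawn on the slide.

```python
from collections import deque
from itertools import combinations


def bfs_dist(adj, source, allowed=None):
    """Hop distances from source; if allowed is given, paths may only use those nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist and (allowed is None or u in allowed):
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def max_pairwise_distance(adj, group, inside_only):
    """Largest distance between group members (infinite if some pair is unreachable)."""
    allowed = set(group) if inside_only else None
    worst = 0
    for a, b in combinations(group, 2):
        worst = max(worst, bfs_dist(adj, a, allowed).get(b, float("inf")))
    return worst


# Assumed example graph with edges 1-2, 1-3, 2-3, 2-4, 3-5, 4-6, 5-6.
adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 5}, 4: {2, 6}, 5: {3, 6}, 6: {4, 5}}
group = [1, 2, 3, 4, 5]
via_network = max_pairwise_distance(adj, group, inside_only=False)
via_group = max_pairwise_distance(adj, group, inside_only=True)
```

Here nodes 4 and 5 are two hops apart only through node 6, which is outside the group, so the group passes the 2-clique distance test while its own diameter is 3.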
![Page 13: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/13.jpg)
Cohesion in directed and weighted networks
something we've already learned how to do:
  find strongly connected components
keep only a subset of ties before finding connected components
  reciprocal ties
  edge weight above a threshold
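Filtering ties and then finding connected components might look like the sketch below. It keeps a pair only if the weight in both directions reaches a minimum, then runs a plain depth-first search. The node names and citation counts are invented for illustration.

```python
def reciprocal_components(weights, min_each_way):
    """weights: {(source, target): count}.

    Keep an undirected tie u-v only if both u->v and v->u have weight
    >= min_each_way, then return the connected components as sets.
    """
    adj = {}
    for (u, v), w in weights.items():
        adj.setdefault(u, set())
        adj.setdefault(v, set())
        if u != v and w >= min_each_way and weights.get((v, u), 0) >= min_each_way:
            adj[u].add(v)
            adj[v].add(u)

    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        components.append(comp)
    return components


# Hypothetical citation counts between four sites.
weights = {
    ("a", "b"): 7, ("b", "a"): 6,
    ("b", "c"): 9, ("c", "b"): 2,   # not reciprocated strongly enough
    ("c", "d"): 5, ("d", "c"): 5,
}
components = reciprocal_components(weights, min_each_way=5)
```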
![Page 14: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/14.jpg)
Example: political blogs (Aug 29th – Nov 15th, 2004)
Figure: citation network among 40 A-list political blogs (nodes numbered 1-40), shown in three panels
(A) all citations between A-list blogs in the 2 months preceding the 2004 election
(B) citations between A-list blogs with at least 5 citations in both directions
(C) edges further limited to those exceeding 25 combined citations
only 15% of the citations bridge communities
Legend, nodes 1-20: 1 DigbysBlog, 2 JamesWalcott, 3 Pandagon, 4 blog.johnkerry.com, 5 OliverWillis, 6 AmericaBlog, 7 CrookedTimber, 8 DailyKos, 9 AmericanProspect, 10 Eschaton, 11 Wonkette, 12 TalkLeft, 13 PoliticalWire, 14 TalkingPointsMemo, 15 Matthew Yglesias, 16 WashingtonMonthly, 17 MyDD, 18 JuanCole, 19 Left Coaster, 20 BradfordDeLong
Legend, nodes 21-40: 21 JawaReport, 22 VodkaPundit, 23 Roger L Simon, 24 TimBlair, 25 Andrew Sullivan, 26 Instapundit, 27 BlogsforBush, 28 LittleGreenFootballs, 29 BelmontClub, 30 Captain'sQuarters, 31 Powerline, 32 HughHewitt, 33 INDCJournal, 34 RealClearPolitics, 35 Winds of Change, 36 Allahpundit, 37 MichelleMalkin, 38 WizBang, 39 Dean'sWorld, 40 Volokh
![Page 15: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/15.jpg)
Affiliation networks
otherwise known as
  membership network (e.g. board of directors)
  hypernetwork or hypergraph
  bipartite graphs
  interlocks
![Page 16: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/16.jpg)
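m-slices
transform to a one-mode network
  weights of edges correspond to number of affiliations in common
m-slice: maximal subnetwork containing the lines with a multiplicity equal to or greater than m

$$
A = \begin{pmatrix}
1 & 1 & 1 & 1 & 0 \\
1 & 1 & 1 & 1 & 0 \\
1 & 1 & 2 & 2 & 0 \\
1 & 1 & 2 & 4 & 1 \\
0 & 0 & 0 & 1 & 1
\end{pmatrix}
$$

Figure: the corresponding one-mode network with line values 1 and 2, with the 1-slice and the 2-slice marked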
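The projection and the slicing can be sketched as follows. The incidence matrix `B` (rows are actors, columns are events) is an assumption: the slide does not give it, and it was chosen only because its projection reproduces the matrix A above.

```python
def project(B):
    """One-mode projection A = B * B^T: A[i][j] counts shared affiliations."""
    n = len(B)
    return [
        [sum(x * y for x, y in zip(B[i], B[j])) for j in range(n)]
        for i in range(n)
    ]


def m_slice(A, m):
    """Lines (1-based node pairs) whose multiplicity is at least m; the diagonal is ignored."""
    n = len(A)
    return [(i + 1, j + 1) for i in range(n) for j in range(i + 1, n) if A[i][j] >= m]


# Assumed incidence matrix: 5 actors (rows) by 4 events (columns).
B = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
]
A = project(B)
two_slice = m_slice(A, 2)   # only nodes 3 and 4 share two affiliations
one_slice = m_slice(A, 1)
```

The diagonal of A counts each actor's own affiliations, which is why it is skipped when extracting lines.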
![Page 17: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/17.jpg)
Pajek:
Net>Transform>2-Mode to 1-Mode> Include Loops, Multiple Lines
Info>Network>Line Values (to view)
Net>Partitions>Valued Core>First threshold and step
![Page 18: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/18.jpg)
Scottish firms interlocking directorates
legend: 2 railways, 4 electricity, 5 domestic products, 6 banks, 7 insurance companies, 8 investment banks
![Page 19: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/19.jpg)
Methods used directly on bipartite graphs
rare
Finding bicliques of users accessing documents
  an algorithm by Nina Mishra, HP Labs
Figure: bipartite graph of Documents and Users
![Page 20: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/20.jpg)
Team Assembly Mechanisms Determine Collaboration Network Structure and Team Performance
Roger Guimerà, Brian Uzzi, Jarrett Spiro, Luís A. Nunes Amaral
Science, 2005
Figure panels: astronomy and astrophysics; social psychology; economics
![Page 21: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/21.jpg)
Issues in assembling teams
Why assemble a team?
  different ideas
  different skills
  different resources
What spurs innovation?
  applying proven innovations from one domain to another
Is diversity (working with new people) always good?
  spurs creativity + fresh thinking
  but
    conflict
    miscommunication
    lack of sense of security of working with close collaborators
![Page 22: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/22.jpg)
Parameters in team assembly
1. m, # of team members
2. p, probability of selecting individuals who already belong to the network
3. q, propensity of incumbents to select past collaborators
Two phases
  giant component of interconnected collaborators
  isolated clusters
![Page 23: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/23.jpg)
Creation of a new team
incumbents (people who have already collaborated with someone)
newcomers (people available to participate in new teams)
pick incumbent with probability p
  if incumbent, pick past collaborator with probability q
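A rough sketch of this selection rule follows. Several details are assumptions rather than the published model: "past collaborator" is read as a past collaborator of someone already on the team, a slot falls back to a newcomer when no incumbent is available, and removal of inactive individuals is left out.

```python
import random


def assemble_team(m, p, q, incumbents, collaborators, next_id, rng):
    """Fill m slots and return (team, next unused newcomer id)."""
    team = []
    for _ in range(m):
        pool = []
        if incumbents and rng.random() < p:
            # Incumbent slot: collect past collaborators of current team members.
            past = set()
            for member in team:
                past |= collaborators.get(member, set())
            past -= set(team)
            if past and rng.random() < q:
                pool = sorted(past)                     # repeat collaboration
            else:
                pool = sorted(incumbents - set(team))   # any other incumbent
        if pool:
            team.append(rng.choice(pool))
        else:
            team.append(next_id)                        # newcomer with a fresh id
            next_id += 1
    return team, next_id


def simulate(n_teams, m, p, q, seed=0):
    """Assemble n_teams teams in sequence; returns (teams, collaborators)."""
    rng = random.Random(seed)
    incumbents, collaborators, next_id = set(), {}, 0
    teams = []
    for _ in range(n_teams):
        team, next_id = assemble_team(m, p, q, incumbents, collaborators, next_id, rng)
        for a in team:
            collaborators.setdefault(a, set()).update(x for x in team if x != a)
        incumbents.update(team)
        teams.append(team)
    return teams, collaborators


teams, collaborators = simulate(n_teams=50, m=4, p=0.6, q=0.5)
```

Two limiting cases are easy to check by hand: with p = 0 every slot is a newcomer, and with p = q = 1 the first team simply keeps re-forming.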
![Page 24: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/24.jpg)
Time evolution of a collaboration network
legend: newcomer-newcomer collaborations; newcomer-incumbent collaborations; new incumbent-incumbent collaborations; repeat collaborations
after a time of inactivity, individuals are removed from the network
![Page 25: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/25.jpg)
BMI data: Broadway musical industry
2258 productions from 1877 to 1990
  musical shows performed at least once on Broadway
  team: composers, writers, choreographers, directors, producers, but not actors
Team size increases from 1877 to 1929
  the musical as an art form is still evolving
After 1929 team composition stabilizes to include 7 people:
  choreographer, composer, director, librettist, lyricist, producer
![Page 26: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/26.jpg)
Collaboration networks
4 fields (with the top journals in each field)
  social psychology (7)
  economics (9)
  ecology (10)
  astronomy (4)
impact factor of each journal
  ratio between citations and recent citable items published
  A = total cites in 1992
  B = 1992 cites to articles published in 1990-91 (this is a subset of A)
  C = number of articles published in 1990-91
  D = B/C = 1992 impact factor
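As a quick arithmetic check of D = B/C, here is a tiny helper. The function name and the numbers are made up for illustration and do not refer to any real journal.

```python
def impact_factor(cites_to_recent, recent_items):
    """D = B / C: cites this year to the previous two years' articles, per article."""
    if recent_items <= 0:
        raise ValueError("need at least one recent citable item")
    return cites_to_recent / recent_items


# Hypothetical journal: B = 300 cites in 1992 to its 1990-91 articles, C = 120 such articles.
d = impact_factor(300, 120)
```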
![Page 27: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/27.jpg)
size of teams grows over time
![Page 28: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/28.jpg)
degree distributions
data
data generated from a model with the same p and q and sequence of team sizes formed
![Page 29: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/29.jpg)
Predictions for the size of the giant component
higher p means already published individuals are co-authoring, linking the network together and increasing the giant component
S = fraction of network occupied by the giant component
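S can be measured from a list of teams by treating each team as a fully connected group and merging groups that share a member. The sketch below uses a small union-find structure; the example teams are hypothetical.

```python
def giant_component_fraction(teams):
    """Fraction of individuals in the largest connected component of the co-membership network."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for team in teams:
        if not team:
            continue
        root = find(team[0])
        for member in team[1:]:
            parent[find(member)] = root    # merge the member's component into the team's

    sizes = {}
    for person in list(parent):
        r = find(person)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values()) / len(parent) if parent else 0.0


# Hypothetical teams: a-b-c and c-d share member c; e-f and g are isolated clusters.
S = giant_component_fraction([["a", "b", "c"], ["c", "d"], ["e", "f"], ["g"]])
```

Here four of the seven individuals are linked through the shared member c, so S is 4/7.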
![Page 30: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/30.jpg)
Predictions for the size of the giant component (cont'd)
increasing q can slow the growth of the giant component – co-authoring with previous collaborators does not create new edges
![Page 31: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/31.jpg)
network statistics

| Field | teams | individuals | p | q | fR | S (size of giant component) |
|---|---|---|---|---|---|---|
| BMI | 2,258 | 4,113 | 0.52 | 0.77 | 0.16 | 0.70 |
| social psychology | 16,526 | 23,029 | 0.56 | 0.78 | 0.22 | 0.67 |
| economics | 14,870 | 23,236 | 0.57 | 0.73 | 0.22 | 0.54 |
| ecology | 26,888 | 38,609 | 0.59 | 0.76 | 0.23 | 0.75 |
| astronomy | 30,552 | 30,192 | 0.76 | 0.82 | 0.39 | 0.98 |

what stands out?
what is similar across the networks?
![Page 32: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/32.jpg)
different network topologies
economics
astronomy
ecology
![Page 33: SI 614 Community structure in networks](https://reader034.vdocuments.us/reader034/viewer/2022051421/56816345550346895dd3d5c4/html5/thumbnails/33.jpg)
Main findings
all networks except astronomy are close to the "tipping" point where the giant component emerges
  sparse and stringy networks
giant component takes up more than 50% of nodes in each network
impact factor (how good the journal is where the work was published)
  p positively correlated
    going with experienced members is good
  q negatively correlated
    new combinations more fruitful
  S for individual journals positively correlated
    more isolated clusters in lower-impact journals
Figure labels: ecology, economics, social psychology; ecology, social psychology