
Upload: jonathan-hawkins

Post on 05-Jan-2016


Page 1: Social Sub-groups Overview Substantive papers: Wayne Baker Social structure in a place where there should be none Scott Feld What causes clustering in

Social Sub-groups

Overview

Substantive papers:

• Wayne Baker: Social structure in a place where there should be none

• Scott Feld: What causes clustering in a network? Opportunity and interests

Methods:

• Continue discussion of social subgroups:
  - Cluster analysis

• Roles & Blockmodels


Social Sub-groups

Wayne Baker: The Social Structure of a National Securities Market:

1) Behavioral assumptions of economic actors
2) Micro-structure of networks
3) Macro-structure of networks
4) Price consequences

Under standard economic assumptions, people should act rationally and act only on price. This would result in expansive and homogeneous (i.e. random) networks. It is, in fact, this structure that allows microeconomic theory to predict that prices will settle to an optimal equilibrium.


Baker’s Model:

He makes two assumptions in contrast to standard economic assumptions:

a) that people do not have access to perfect information, and
b) that some people act opportunistically.

He then shows how these assumptions change the underlying mechanisms in the market, focusing on price volatility as a marker for uncertainty.

The key on the exchange floor is the “market makers”: people who keep the process active, keep trading alive, and thus do not ‘hoard’ (hoarding would lower profits system-wide).


Baker’s Model:

Micronetworks: Actors should trade extensively and widely. Why might they not?

A) Physical factors (noise and distance)
B) Avoid risk and build trust

Macro-Networks: Should be undifferentiated. Why not?

A) Large crowds should be more differentiated than small crowds. Why?

Price consequences: Markets should clear. They often don’t. Why?

Network differentiation reduces economic efficiency, leading to less information and more volatile prices


Baker: Use frequency of exchange to identify the network, resulting in:

Baker finds that the structure of this network significantly (and differentially) affects the price volatility of the network


Baker: Because size is the primary determinant of clustering in this setting, he concludes that the standard economic assumption of large market = efficient is unwarranted.


Scott Feld: Focal Organization of Social Ties

Feld wants to look at the effects of constraint & opportunity for mixing, to situate relational activity within a wider context.

The contexts form “Foci”: “A social, psychological, legal or physical entity around which joint activities are organized” (p.1016)

People with similar foci will be clustered together. He contrasts this with social balance theory.

Claim: much of the clustering attributed to interpersonal balance processes is really due to focal clustering.

(Note that this is not a theoretically fair critique, given that balance theory can easily accommodate non-personal balance factors (like smoking or group membership), but it is a good empirical critique: most researchers haven’t properly accounted for foci.)


Cluster analysis

As with the network cluster algorithms, there are many options for clustering. The three that I use most are:

• Ward’s Minimum Variance -- the one I use almost 95% of the time
• Average Distance -- the one used in the example above
• Median Distance -- very similar

Again, the SAS manual is the best single place I’ve found for information on each of these techniques.

Some things to keep in mind:

Units matter. The example above draws together pairs horizontally because the range there is smaller. Get around this by standardizing your data.
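A minimal sketch of what standardizing means here, in Python rather than SAS. It rescales each variable to mean 0 and standard deviation 1 before any distances are computed. The function name `standardize` and the data values are made up for illustration.

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # A constant variable carries no distance information.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical data: x spans a narrow range, y a wide one.
x = [1.0, 1.2, 0.8, 1.1]
y = [100.0, 300.0, 200.0, 400.0]

zx = standardize(x)
zy = standardize(y)

# After rescaling, both variables have the same spread, so neither
# dominates a Euclidean distance just because of its units.
print([round(v, 2) for v in zx])
print([round(v, 2) for v in zy])
```

Without this step, differences on y would swamp differences on x in any distance calculation, simply because y is measured on a larger scale.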

This is an inductive technique. You can find clusters in a purely random distribution of points. Consider the following example.

Cluster analysis

The data in this scatter plot are produced using this code:

data random;
  do i=1 to 20;
    x=rannor(0);
    y=rannor(0);
    output;
  end;
run;
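To see that a clustering routine returns groups even when the data have none, here is a rough Python sketch. It draws 20 random normal points, as the SAS step does, and runs a naive average-linkage clustering on them. This is an illustration only, not a reimplementation of PROC CLUSTER; the function name `average_linkage`, the seed, and the choice of three clusters are all assumptions.

```python
import math
import random

def average_linkage(points, n_clusters):
    """Naive agglomerative clustering with average linkage.

    Repeatedly merges the two clusters with the smallest mean pairwise
    Euclidean distance until n_clusters remain.
    Returns a list of clusters, each a list of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def avg_dist(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = avg_dist(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters

rng = random.Random(0)  # fixed seed so the run is repeatable
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]

clusters = average_linkage(points, 3)
for members in clusters:
    print(sorted(members))
```

The routine always hands back exactly as many groups as you ask for, whether or not the points have any real structure.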

Cluster analysis: Resulting dendrogram [figure]

Cluster analysis: Resulting cluster solution [figure]

Cluster analysis

Cluster analysis works by building a matrix of distances between every pair of points. In the example above, it used the Euclidean distance, which in two dimensions is simply the physical distance between the points in a plot.

It can work in any number of dimensions.
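As a small illustration, the same distance-matrix construction works unchanged for two, three, or more variables. The helper name `distance_matrix` and the points are hypothetical.

```python
import math

def distance_matrix(points):
    """Euclidean distance between every pair of points, as a square list of lists."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

# Two variables per case ...
pts2 = [(0.0, 0.0), (3.0, 4.0)]
# ... or three: nothing in the function changes.
pts3 = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0)]

print(distance_matrix(pts2))
print(distance_matrix(pts3))
```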

To use cluster analysis in a network, we base the distance on the path-distance between pairs of people in the network.

Consider again the blue-eye hip example:


Cluster analysis

Distance Matrix

0 1 3 2 3 3 4 3 3 2 3 2 2 1 1
1 0 2 2 2 3 3 3 2 1 2 2 1 2 1
3 2 0 3 2 4 3 3 2 1 1 1 2 2 3
2 2 3 0 1 1 2 1 1 2 3 3 3 2 1
3 2 2 1 0 2 1 1 1 1 2 2 3 3 2
3 3 4 1 2 0 1 1 2 3 4 4 4 3 2
4 3 3 2 1 1 0 2 2 2 3 3 4 4 3
3 3 3 1 1 1 2 0 1 2 3 3 4 3 2
3 2 2 1 1 2 2 1 0 1 2 2 3 3 2
2 1 1 2 1 3 2 2 1 0 1 1 2 2 2
3 2 1 3 2 4 3 3 2 1 0 1 2 2 3
2 2 1 3 2 4 3 3 2 1 1 0 1 1 2
2 1 2 3 3 4 4 4 3 2 2 1 0 2 2
1 2 2 2 3 3 4 3 3 2 2 1 2 0 1
1 1 3 1 2 2 3 2 2 2 3 2 2 1 0
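A path-distance matrix like this one can be produced with a breadth-first search from each node. The sketch below is in Python and uses a made-up five-node network, not the network from the slides; the function name `path_distances` is also just for illustration.

```python
from collections import deque

def path_distances(adj):
    """All-pairs shortest path lengths for an unweighted network.

    adj maps each node to its neighbours. Rows and columns follow the
    sorted node order; pairs with no connecting path are left as None.
    """
    nodes = sorted(adj)
    index = {node: k for k, node in enumerate(nodes)}
    dmat = [[None] * len(nodes) for _ in nodes]
    for source in nodes:
        row = dmat[index[source]]
        row[index[source]] = 0
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbour in adj[current]:
                if row[index[neighbour]] is None:
                    # First visit is along a shortest path in BFS.
                    row[index[neighbour]] = row[index[current]] + 1
                    queue.append(neighbour)
    return dmat

# Hypothetical network: a chain 1-2-3-4 plus a tie between 1 and 5.
toy = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3], 5: [1]}

for row in path_distances(toy):
    print(" ".join(str(d) for d in row))
```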

Cluster analysis

The distance matrix implies a space in which the nodes are embedded. Using something like MDS, we can represent the space implied by the distance matrix in two dimensions. This is the image of the network you would get if you did that.
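For readers who want to see what "something like MDS" involves, here is a rough sketch of classical (Torgerson) scaling in plain Python. It is one MDS variant among several and not necessarily the one used for the slide. It assumes a symmetric distance matrix and that the two leading eigenvalues are positive, which holds for Euclidean distances but is not guaranteed for path distances. The function name `classical_mds` and the test points are hypothetical.

```python
import math
import random

def classical_mds(dist, dims=2, iterations=500):
    """Embed a symmetric distance matrix in `dims` dimensions.

    Double-centres the squared distances, then extracts the leading
    eigenvectors by power iteration with deflation.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # B = -1/2 * (D^2 - row means - column means + grand mean)
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    rng = random.Random(0)  # fixed seed so the run is repeatable
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        start_norm = math.sqrt(sum(x * x for x in v))
        v = [x / start_norm for x in v]
        for _ in range(iterations):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(v[i] * bv[i] for i in range(n))
        scale = math.sqrt(max(eigenvalue, 0.0))
        for i in range(n):
            coords[i][d] = scale * v[i]
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * v[i] * v[j]
    return coords

# Hypothetical check: four corners of a 3-by-1 rectangle.
pts = [(0.0, 0.0), (3.0, 0.0), (3.0, 1.0), (0.0, 1.0)]
dist = [[math.dist(p, q) for q in pts] for p in pts]
coords = classical_mds(dist)
for xy in coords:
    print([round(c, 3) for c in xy])
```

The recovered coordinates may be rotated or reflected relative to the originals, but the distances between points are preserved when the input is Euclidean.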

Cluster analysis

When you use variables, the cluster analysis program generates a distance matrix. We can instead use the network distance matrix directly. If we do that with this example network, we get the following:

[Figure: cluster analysis of the example network]

Cluster analysis

In SAS you use two commands to get a cluster analysis. The first does the hierarchical clustering. The second analyzes the cluster output to create the tree.

Example 1. Using variables to define the space (like income and musical taste):

proc cluster data=a method=ave out=clustd std;
  var x y;
  id node;
run;

proc tree data=clustd ncl=5 out=cluvars;
run;

Cluster analysis

Example 2. Using a pre-defined distance matrix to define the space (as in a social network).

You first create the distance matrix (in IML), then use it in the cluster program.

proc iml;
  %include 'c:\moody\sas\programs\modules\reach.mod';

  /* blue eye example */
  mat2=j(15,15,0);
  mat2[1,{2 14 15}]=1;
  /* lines cut here */
  mat2[15,{1 14 2 4}]=1;

  dmat=reach(mat2);
  mattrib dmat format=1.0;
  print dmat;

  id=1:nrow(dmat);
  id=id`;
  ddat=id||dmat;

  create ddat from ddat;   /* creates the dataset */
  append from ddat;
quit;

data ddat (type=dist);     /* tells SAS it is a distance matrix */
  set ddat;
run;

Cluster analysis

Example 2. Using a pre-defined distance matrix to define the space (as in a social network).

Once you have it, the cluster program is just the same.

proc cluster data=ddat method=ward out=clustd;
  id col1;
run;

proc tree data=clustd ncl=3 out=netclust;
  copy col1;
run;

proc freq data=netclust;
  tables cluster;
run;

proc print data=netclust;
  var col1 cluster;
run;


The CROWDS algorithm combines the density approach above with an initial cluster analysis and a routine for determining how many clusters are in the network. It does so by using the Segregation index and all of the information from the cluster hierarchy, combining two groups only if doing so improves the segregation fit for both groups.



The one other program you should know about is NEGOPY. NEGOPY combines elements of the density-based approach and the graph-theoretic approach to find groups and positions. Like CROWDS, NEGOPY assigns people both to groups and to ‘outsider’ or ‘between’ group positions. It also tells you how many groups are in the network.

It’s a DOS-based program, and a little clunky to use, but NEGWRITE.MOD will translate your data into NEGOPY format if you want to use it.

There are many other approaches. If you’re interested in some specifically designed for very large networks (10,000+ nodes), I’ve developed something I call Recursive Neighborhood Means that seems to work fairly well.


The Crowds Algorithm

1. Identify members of network bicomponents, remove people not included.

2. Cluster the reduced network.
   - Identify optimal number of groups (TREEWALK):
     - For each level of the cluster partition tree do (BFS):
       - Move up the tree from smaller to larger groups.
       - If the fit for both groups is improved by joining them, then do so.
       - If not, then identify group at that level.
     - End TREEWALK.

Do until all groups are identified (GLOBAL LOOP):

3. Evaluate node fit. Do until nodes cannot be moved:
   For each identified cluster do (GRPCHECK):
   - Ensure group is a bi-component.
   - Calculate effect on group a of moving node j to group a.
   - Calculate effect on j's present group of removing j.
   - If there is a positive net gain to moving j from own group to a, then do so.
   End.

4. Identify bridging members.
   - If removing j from group a would improve the fit of group a, AND assigning j to any other group would lower the fit for that group, then j is considered a bridge. Place all bridges in a separate class.

5. Group check. Check returns to combining groups.
   - If merging groups would improve the fit of all groups to be merged, then do so.
   - Evaluate bridges, to be sure that they are not bridging two groups that have now merged.

End Global loop.
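The merge rule and the bridge rule in the steps above can be sketched in a few lines of Python. This is only an illustration of the two decision rules, not the CROWDS program itself. The slides do not define the segregation index, so the fit measure here is an assumed stand-in: a Freeman-style index comparing observed ties leaving a group with the number expected under random mixing. All function names and the example network are hypothetical.

```python
def build_adj(ties):
    """Symmetric adjacency sets from a list of undirected ties."""
    adj = {}
    for i, j in ties:
        adj.setdefault(i, set()).add(j)
        adj.setdefault(j, set()).add(i)
    return adj

def segregation(adj, group):
    """Assumed fit measure: (expected - observed) / expected cross-group ties."""
    members = set(group)
    n, size = len(adj), len(members)
    cross_pairs = size * (n - size)
    n_ties = sum(len(nbrs) for nbrs in adj.values()) / 2
    if cross_pairs == 0 or n_ties == 0:
        return 0.0
    expected = n_ties * cross_pairs / (n * (n - 1) / 2)
    observed = sum(1 for i in members for j in adj[i] if j not in members)
    return (expected - observed) / expected

def should_merge(adj, group_a, group_b, fit=segregation):
    """Merge only if the joined group fits better than each group alone."""
    merged = fit(adj, set(group_a) | set(group_b))
    return merged > fit(adj, group_a) and merged > fit(adj, group_b)

def is_bridge(adj, node, groups, fit=segregation):
    """Bridge: own group improves without the node, and every other
    group would fit worse if the node were added to it."""
    own = next(g for g in groups if node in g)
    if fit(adj, own - {node}) <= fit(adj, own):
        return False
    return all(fit(adj, g | {node}) < fit(adj, g) for g in groups if g is not own)

# Hypothetical network: three triangles, each tied once to node "j".
adj = build_adj([
    ("a1", "a2"), ("a1", "a3"), ("a2", "a3"),
    ("b1", "b2"), ("b1", "b3"), ("b2", "b3"),
    ("c1", "c2"), ("c1", "c3"), ("c2", "c3"),
    ("j", "a1"), ("j", "b1"), ("j", "c1"),
])
groups = [{"a1", "a2", "a3", "j"}, {"b1", "b2", "b3"}, {"c1", "c2", "c3"}]

print(should_merge(adj, {"a1", "a2"}, {"a3"}))   # pieces of one triangle
print(should_merge(adj, groups[1], groups[2]))   # two separate triangles
print(is_bridge(adj, "j", groups))
print(is_bridge(adj, "a1", groups))
```

With this assumed fit measure, the pieces of a single triangle are joined, the two separate triangles are kept apart, and only node "j" is flagged as a bridge.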