Analysis of Heterogeneous Patterns and Social Dynamics in Human Communication
A Senior Project submitted to
The Division of Science, Mathematics, and Computing
of Bard College
by Zhechao Zhou
Annandale-on-Hudson, New York
April, 2009
Abstract
Vertex connectivities in complex networks are widely recognized to display a scale-free power-law distribution generated by network growth and node preferential attachment. Observation of the power-law distribution of tie weights together with the over-representation of network motifs in social networks suggests the existence of non-trivial network mechanisms beyond node level. This project investigates the existing social network theories and proposes a stochastic model for weighted scale-free networks in which high-weight ties are more likely to grow using random walk as an approach. The model reproduces the observed scale-free weight distributions from a pure dyad perspective and concavity of the distribution curve as the consequence of triad embeddedness. Computer simulations of both processes confirm the theoretical predictions with the statistical analysis of a communication network among millions of individuals. This project, as part of an ongoing NSF-funded social network study, contributes to the dynamic analysis of processes of tie formation, stability and change at the dyad and triad level and leads to a systematic understanding of the fundamental processes and interdependencies underlying social network dynamics that has deep implications in large-scale human interaction behavior.
Contents
Abstract
Dedication
Acknowledgments
1 Introduction
2 Preliminaries
  2.1 Definition
  2.2 Theories, models and mechanisms
    2.2.1 A Brief Introduction of Social Balance Theory
    2.2.2 Other Relevant Factors
    2.2.3 Master Equation
  2.3 Data
3 Continuous-Time Random Walks and Waiting Time
4 Dyadic Dynamics
5 Triadic Dynamics
6 Conclusion and Future Work
7 Appendix
  7.1 Code
    7.1.1 Dyad process simulation
    7.1.2 Triad process simulation
    7.1.3 Triad weight distribution code
  7.2 Other results in triads
Bibliography
List of Figures
2.1.1 16 types of triads.
2.3.1 In and out degree distribution of the nodes in the data.
3.0.1 Waiting time distribution generated from 6 week cell phone records.
4.0.1 Distribution of dyad weights from the data.
4.0.2 The comparison of a computer-simulated weight distribution and the real-world data. The simulation is generated for 5000 nodes, ε = 0.01, over T = 200 time steps and m = 2 × 10^6 every step.
4.0.3 Other potential evolving probability function trial results: w^2/t^2 (yellow), w^0.4/t (light blue), w^0.7/t (pink), w^0.9/t (blue), w/t (red) compared with the real data in green.
5.0.1 P(a,b,c)-abc plot. P(a,b,c) spans over the 1.1–1.5 slope range.
5.0.2 The comparison of a computer-simulated weight distribution based on the dyad, triad mechanisms and the real-world data. The triadic simulation is generated for 500 nodes, ε = 0.01, δ = 0.0012, over T = 350 time steps and m = 75000 every step.
7.2.1 Triad weight distribution from the data. The three axes represent the weights of the three combining dyads. The color scale indicates log(frequency).
7.2.2 Triad weight distribution from the simulation. The three axes represent the weights of the three combining dyads. The color scale indicates log(frequency).
Acknowledgments
I would like to express my sincerest gratitude to all those who helped me in completing this thesis. I want to thank my advisors Matthew Deady and Gregory Landweber for their efforts in learning and understanding the material of this project despite most of it being new to them. I am grateful to Matthew for his academic advising not only over the course of the senior year, but also at every step of my Bard experience. Greg must also be duly thanked here for his assistance with computer programming. The mathematical clarity of this project benefits from his comments. I also thank Christian Bracher for being on my board and giving feedback.
I would like to extend my gratitude to Zoltan Toroczkai of the University of Notre Dame, who made this cool project possible, for teaching me how to do research and being incredibly nice. He is and will remain the knowledgeable network guru in my mind.
Burton Brody is not to be forgotten here for sharing his experience, knowledge and good sense of humor. I wish I could have had more classes with him.
There are many others, but they know who they are.
1 Introduction
Complex networks describe a wide range of systems of high technological and intellectual
importance. For example, the Internet is a complex network of routers and computers
connected by physical or wireless links; the cell is best described as a complex network
of chemicals connected by chemical reactions; opinions and ideas are spread through the
social network, whose nodes are human beings and whose edges consist of various social
relationships. These systems represent just a few of many complex networks that prompted
the scientific investigation into the mechanism that determines the topology of complex
networks in recent years.
Traditionally, the study of complex networks has been the territory of graph theory. In
the 1950s the modeling of large-scale networks as random graphs with no apparent design
principles was first studied by Erdos and Renyi (see [6]). According to the Erdos-Renyi
model, the network starts with N nodes and connects every pair of nodes with probability
p, creating a graph with approximately pN(N − 1)/2 edges distributed randomly. In a
random graph, since the edges are placed randomly, the majority of nodes have
approximately the same degree k (the number of connecting neighbors), close to the average
degree 〈k〉 of the network. The degree distribution of a random graph follows a Poisson
distribution with a peak at 〈k〉.
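As a rough illustration of the Erdos-Renyi construction described above, the following Python sketch builds one random graph and compares its mean degree with p(N − 1). The function name, the seed, and the values N = 1000 and p = 0.01 are illustrative assumptions and are not taken from this project.

```python
import random
from collections import Counter

def erdos_renyi_degrees(n, p, seed=0):
    """Return the degree of every node in one random graph with n nodes,
    where each pair of nodes is linked independently with probability p."""
    rng = random.Random(seed)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return degree

n, p = 1000, 0.01                      # illustrative values
degree = erdos_renyi_degrees(n, p)
mean_degree = sum(degree) / n
expected_mean = p * (n - 1)            # average degree <k> of the model
histogram = Counter(degree)            # degree -> number of nodes
print(f"mean degree {mean_degree:.2f}, expected {expected_mean:.2f}")
print("most common degree:", histogram.most_common(1)[0][0])
```

Plotting `histogram` for larger N should show the single peak near 〈k〉 that distinguishes a random graph from the fat-tailed distributions discussed next.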
In the past few years, motivated by the emergence of large databases collected by
computers and the increasing computing power available, many new statistical measures on
complex networks have been proposed and studied in depth. For a large number of
networks, such as the World Wide Web and protein interaction networks, there exists
strong evidence that the degree distribution has a power-law tail $P(k) \sim k^{-\gamma}$, where
$\gamma$ is between 2 and 3 for most cases. Such networks are called scale-free networks. The
Barabasi-Albert model (see [3]) presents the simplest self-organizing mechanism of the
scale-free structure. At each time step a new node is added and connected to old nodes
of the network through m new links. The probability of an old node being linked with
the new node is proportional to the number of connections the old node already has. It
establishes $P(k) \sim k^{-\gamma}$ at long times and large connectivities.
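The growth rule above can be sketched in a few lines of Python. This is only a minimal sketch: the seed graph (a complete graph on m + 1 nodes), the stub-list sampling trick, the function name and the parameter values are assumptions made here for illustration, since the description above does not prescribe them.

```python
import random

def barabasi_albert_degrees(n, m, seed=0):
    """Grow a network to n nodes. Each new node attaches to m distinct old
    nodes chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Assumed seed graph: a complete graph on m + 1 nodes (every degree is m).
    degree = [m] * (m + 1)
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from `stubs` is a degree-proportional draw over nodes.
    stubs = [v for v in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        degree.append(m)               # the new node starts with m links
        for old in targets:
            degree[old] += 1
            stubs.extend((old, new))
    return degree

degree = barabasi_albert_degrees(n=2000, m=2)   # illustrative sizes
print("largest degree:", max(degree), "smallest degree:", min(degree))
```

A few early nodes typically end up with degrees far above the minimum m, which is the qualitative signature of preferential attachment.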
This discovery of the power-law degree distribution indicates that the topology and
evolution of complex networks are governed by robust non-random organizing principles.
The tools of statistical mechanics offer an ideal framework for studying complex
networks. The fact that some motifs such as triads and clusters are over-represented in
real-world networks as compared to randomized networks with the same degree distribution
suggests some fundamental and potentially generic features of network dynamics. Yet,
the mechanism responsible for these marked non-random features remains unknown. This
project is a first attempt to unveil the weighted dyad and triad formation mechanism
in complex networks, with a focus on social networks.
In Chapter 2 we first introduce the definitions of the technical terms in this project.
A detailed review of various theoretical and empirical results from the existing journal
literature follows to establish and justify the assumptions made for the model later on in
this project. In the second section, we introduce the master equation as the major method
employed in this project and discuss the data used to test the theoretical model.
In Chapter 3 we discuss continuous-time random walks and the waiting time density. We
show the whole calculation to derive the general expression for the probability of walking
n steps in time t and the specific one given the waiting time density function $A t^{-\alpha}$ from
the empirical data.
In Chapter 4 we set up the basic assumptions of the purely dyadic model, explain the
reasoning behind the calculation and derive the conclusion. We also employ Monte Carlo
simulation to test the result and compare it with the analysis of the real-world data.
In Chapter 5 we discuss the modified model of dyad evolution within the context of local
communities. We will explain the extra assumption being made and its relationship with
social balance theory. The result of the triadic process is presented numerically through
computer simulation.
2 Preliminaries
2.1 Definition
In network science, a node is an agent in the network. A useful way to think about
the network is as a structure revealed through undirected or directed dyadic ties
(edges) between pairs of nodes. The degree of a node is the number of edges that are incident
with it, or the number of nodes adjacent to it. In sociology, a dyad describes
the relationship between two nodes. Dyads may carry weights to show relationship types,
frequency, tie strengths or some other dyadic properties.
Triads consist of three nodes and the possible dyadic relationships between any two
of them. A dyad is usually embedded in more than one triad. Social scientists classify
triads into 16 different types regardless of node combinations [16]. In this project, we
use a generalized definition of the triads as the combination of the three weighted dyads
between any three nodes in the network.
Figure 2.1.1. 16 types of triads.

The clustering coefficient of a node in a graph quantifies how close the node and its
neighbors are to being a clique (complete graph). It is a direct measure of how a single
node is organized into triads. Suppose a selected node $i$ in the
network has $k_i$ edges which connect it to $k_i$ other nodes. The clustering coefficient is the
ratio between the number $E_i$ of edges that actually exist between these $k_i$ nodes and the
total possible $k_i(k_i - 1)/2$ edges,
\[
C_i = \frac{2E_i}{k_i(k_i - 1)}.
\]
The clustering coefficient for the whole
system is given as the average of the clustering coefficient for each node.
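The definition above can be sketched in Python on a toy graph. The adjacency-set representation, the function name, the toy graph itself and the convention that a node with fewer than two neighbors gets coefficient 0 are assumptions made here for illustration.

```python
from itertools import combinations

def clustering_coefficient(adj, i):
    """C_i = 2 E_i / (k_i (k_i - 1)) for node i of an undirected graph.

    adj maps each node to the set of its neighbours.
    """
    neighbours = adj[i]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumed convention: no pair of neighbours to compare
    # E_i: edges that actually exist among the k_i neighbours of i
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))

# Toy graph: triangle a-b-c plus a pendant node d attached to a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
per_node = {v: clustering_coefficient(adj, v) for v in adj}
average = sum(per_node.values()) / len(per_node)
print(per_node, average)
```

Here node a has three neighbors of which only the pair (b, c) is linked, so its coefficient is 1/3, while b and c each sit in a closed triangle and get 1.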
2.2 Theories, models and mechanisms
2.2.1 A Brief Introduction of Social Balance Theory
In real world social networks, social cliques are often observed. People who share interests,
views, purposes, patterns of behavior, or ethnicity tend to form strong ties with the
other members in the groups, which in turn shape their actions. Therefore some motifs
such as triads and clusters are over-represented in real-world networks as compared to
randomized networks with the same degree distribution [13], and some triadic
configurations such as the non-transitive unidirectional triad 030C are very rare in the networks
[17]. Although the Barabasi-Albert model has successfully explained the scale-free nature
of many networks on node level, the empirical evidence of a large number of triads, well above
random expectations in the vast majority of real networks, has drawn attention to
multi-agent network dynamics beyond node level.
Triads in complex networks are recognized as the building blocks in the composition of
recurring subgraphs, closely related to the large-scale organization of complex networks.
Heider [8] first developed the basic components of structural balance theory as a framework
for studying the structural arrangement in social networks. An edge ij is positive if nodes
i and j are friends, and negative if i and j are enemies. The triad ijk is balanced if the sign
of the product of the links in the triad is positive. A balanced triad fulfills the following
adage: a friend of my friend is my friend; an enemy of my friend is my enemy; a friend
of my enemy is my enemy; an enemy of my enemy is my friend. A network is balanced if
each constituent triad is balanced. Cartwright and Harary [5] generalized the definition:
a network is balanced if every cycle in it is balanced. The cycle-based and
triad-based definitions of balance are equivalent on complete graphs.
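The triad-based definition of balance can be sketched as follows. The dictionary-of-signs representation, the function names and the four-node example with two camps are illustrative assumptions, not part of the cited models.

```python
from itertools import combinations

def is_balanced_triad(sign, i, j, k):
    """A triad is balanced when the product of its three edge signs is positive."""
    product = (sign[frozenset((i, j))]
               * sign[frozenset((j, k))]
               * sign[frozenset((i, k))])
    return product > 0

def is_balanced_network(nodes, sign):
    """Triad-based balance on a complete signed graph."""
    return all(is_balanced_triad(sign, i, j, k)
               for i, j, k in combinations(nodes, 3))

# Two camps {a, b} and {c, d}: friends (+1) inside a camp, enemies (-1) across.
nodes = ["a", "b", "c", "d"]
camp = {"a": 0, "b": 0, "c": 1, "d": 1}
sign = {frozenset((u, v)): (1 if camp[u] == camp[v] else -1)
        for u, v in combinations(nodes, 2)}
balanced_before = is_balanced_network(nodes, sign)

# If a befriends c, triad a-b-c has exactly one negative tie and is unbalanced.
sign[frozenset(("a", "c"))] = 1
balanced_after = is_balanced_network(nodes, sign)
print(balanced_before, balanced_after)
```

The two-camp network satisfies all four lines of the adage, whereas a single cross-camp friendship leaves b with "a friend of my friend" who is still an enemy.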
Antal [1],[2] presented a simple rate-equation model for social dynamics in which both
friendly and unfriendly links evolve following the assumption that people naturally tend
to avoid imbalanced relationships. The study shows that a balanced network tends to be
more stable while unbalanced networks tend to be unstable. A finite network always falls
into a socially balanced absorbing state where no imbalanced triads remain after a certain
amount of time. Balance theory predicts that triads tend to form a fully reciprocal closure
or dissolve. The resulting social networks display a strong local community structure and
relatively weak inter-community connections.
2.2.2 Other Relevant Factors
Tie formation (meaning two nodes establish a new connection) and tie persistence (meaning
the established connections continue to exist or strengthen by increasing weights) are
not only subject to social balance forces but also to other factors. Hidalgo and
Rodriguez-Sickert study the coupling between the structure of the network and the temporal stability
of its links [9]. Using ten panels of mobile phone data covering a year, they show that
persistent links tend to be reciprocal and are more common for
people with low degrees and high clustering coefficients. Reciprocity tends to be the
strongest predictor of tie persistence in social
networks among the dyadic characteristics.
The probability that two nodes are connected by a link follows a gravity model, i.e.
it decreases as the inverse square of the geographical distance [11]. Communication triangles
are not only composed of geographically adjacent nodes but may also extend over long distances.
Shorter links have higher probabilities of belonging to triangles than longer links. Local
groups are characterized by short communication and high clustering coefficients, and
long-distance groups feature longer communication and low clustering coefficients. In
a communication network the majority of the strong ties are found within the clusters,
indicating that users spend most of their on-air time talking to members of their immediate
circle of friends. In contrast, most links connecting different communities are significantly
weaker than the links within the communities [14].
2.2.3 Master Equation
The notion of a random process grows from an attempt to describe microscopically complex
processes by statistical equations of evolution. A mathematical model for the random
evolution of a memory-less system, that is, one for which the likelihood of a given future
state, at any given moment, depends only on its present state, and not on any past states,
is called a Markov process. In statistical physics a master equation is widely used to
describe a Markov process. The master equation gives the rate of change of the continuous
probability function $P(n, t)$ due to transitions into the state $n$ from all other states and
due to transitions out of state $n$ into all other states [15],
\[
\frac{\partial P(n, t)}{\partial t} = \sum_{m=1} \left[ P(m, t)\, w_{m,n}(t) - P(n, t)\, w_{n,m}(t) \right],
\]
where $w_{m,n}(t)$ is the transition rate from state $m$ to state $n$, that is, the probability of
such a transition during a short time interval $\Delta t$ divided by $\Delta t$.
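To make the gain and loss terms of the master equation concrete, the sketch below integrates it with a forward Euler step for a two-state system. The rate matrix, the step size, the number of steps, the 0-based state labels and the function name are illustrative assumptions; the rates are taken as constant in time for simplicity.

```python
def euler_master_equation(P, W, dt, steps):
    """Integrate dP_n/dt = sum_m [P_m W[m][n] - P_n W[n][m]] by forward Euler.

    W[m][n] is the (constant) transition rate from state m to state n.
    """
    size = len(P)
    P = list(P)
    for _ in range(steps):
        dP = [
            sum(P[m] * W[m][n] - P[n] * W[n][m] for m in range(size))
            for n in range(size)
        ]
        P = [P[n] + dt * dP[n] for n in range(size)]
    return P

# Two states: rate 1 for the transition 0 -> 1 and rate 2 for 1 -> 0.
W = [[0.0, 1.0],
     [2.0, 0.0]]
P = euler_master_equation([1.0, 0.0], W, dt=0.01, steps=5000)
print(P)
```

Total probability is conserved at every step because each gain term of one state is the loss term of another, and the solution relaxes to the stationary state in which the two flows cancel (here P(0) = 2/3 and P(1) = 1/3).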
2.3 Data
Although individual human behavior is rather complex and unpredictable, there are quan-
titative measures characteristic to large collectives of humans that are reproducible and
thus subject to measurement. In some sense, a large population is similar to an ideal gas
system in which the measurement and modeling on a single molecule’s movement is much
harder than on the macro-scale of the whole system. In the past, social scientific research
on human behavior and social networks has been rather limited due to the lack of high-quality
large-scale data and of quantitative methods to analyze such data. In general, data obtained
through surveys are usually small, non-longitudinal (a single snapshot in time) and biased by
the presence of the observer and the methods used in the experiment.
The data in this project are generated from a nationwide cell-phone network of about 6.7
million users in an industrialized country. The operation of cell-phone systems generates
enormous amounts of information on human activity patterns. The customer bases of cell
phone companies range from hundreds of thousands to millions, therefore such large-scale
data sets can provide exceptional statistics on social networks and human dynamics. The
second advantage of this type of data is that it is purely observational; therefore, it is
free of the observer bias that affects survey data.
The social network is a structure revealed through dyadic communication among peo-
ple. If A and B have communicated with each other (through voice calls, SMSs) within
a given time frame Tw, then we assume that a tie exists between them. The frequency of
communication interactions between them defines tie strength. Repeating this construc-
tion process with all pairs, one obtains the full social network of the population during a
particular time window. In sum, the full social network in a given time period can be
described by the nodes (users), and the tie strengths (captured by frequency measures). The
analysis of this network obtained from a given time frame will provide us with extensive
cross-sectional information, while the study of the time evolution of this network will give
us unprecedented longitudinal information.

Figure 2.3.1. In and out degree distribution of the nodes in the data (log-log axes; series "in_degree.txt" and "out_degree.txt").

The data include the caller, callee, type (voice call or SMS), number
of calls between the caller and callee, and total duration in seconds. In total, 24,441,476
entries covering all the communication events in the network are collected. The degree
distribution has a fat tail slowly decreasing to zero, with some out-degree noise near the end
of the tail (see Figure 2.3.1), which is consistent with a scale-free network.
The mobile network captures only a subset of all interactions between individuals, a
detailed mapping of which would include face-to-face, e-mail, and land line communica-
tions as well. Yet although mobile phone data capture just a slice of total communications,
research on media multiplexity suggests that the use of one medium for communication
between two people implies communications by other means as well [7]. Moreover, because
of the absence of cell-phone directory listings, these data effectively map the network of
trusted interactions from the overall social network. Therefore the data can be used as a
proxy of the communication network between individuals.
3 Continuous-Time Random Walks and Waiting Time
A random walk is a mathematical formalization of a trajectory that consists of taking
successive random steps. An elementary example of a random walk is the binomial random
walk along the integer line, which starts at $S_0 = 0$ and at each step moves by ±1 with
equal probability. The steps are statistically independent. $S_n$ is the position of the walker
on the line after $n$ steps. It is not hard to see that the expectation values are $E(S_n) = 0$ and
$E(S_n^2) = n$.
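The two expectation values can be checked exactly for a small number of steps by enumerating every path of the walk. The function name and the choice n = 10 are illustrative assumptions.

```python
from itertools import product

def exact_moments(n):
    """Return (E[S_n], E[S_n^2]) by enumerating all 2^n equally likely paths."""
    total, total_sq = 0, 0
    for steps in product((-1, 1), repeat=n):
        s = sum(steps)          # final position S_n of this path
        total += s
        total_sq += s * s
    paths = 2 ** n
    return total / paths, total_sq / paths

mean, mean_sq = exact_moments(10)
print(mean, mean_sq)
```

The mean vanishes by symmetry, and the second moment equals n because the cross terms between independent steps average to zero, leaving n terms of value 1.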
We consider a random walk for which the times Tj between successive steps are in-
dependent, identically distributed random variables, with a common probability density
function ψ(t), called the waiting-time density. Suppose that we allow a continuous-time
random walk to begin at time t = 0. Let ψn(t) denote the probability density function for
the time at which the nth step occurs, so that ψ1(t) = ψ(t). From the independence of
the times between steps, we have the recurrence relation
\[
\psi_n(t) = \int_0^t \psi(\tau)\, \psi_{n-1}(t - \tau)\, d\tau. \tag{3.1}
\]
The convolution integral in Equation 3.1 may be factorized by applying the Laplace
transformation. The Laplace transformation of the function $\phi$ is
\[
\phi(s) = \mathcal{L}[\phi(t)] = \int_0^\infty e^{-st} \phi(t)\, dt. \tag{3.2}
\]
Applying the Laplace transformation to Equation 3.1,
\begin{align*}
\psi_n(s) &= \mathcal{L}[\psi_n(t)] \\
          &= \mathcal{L}[\psi(t)]\, \mathcal{L}[\psi_{n-1}(t)] \\
          &= \mathcal{L}[\psi(t)]^2\, \mathcal{L}[\psi_{n-2}(t)] \\
          &\;\;\vdots \\
          &= \mathcal{L}[\psi(t)]^n \\
          &= [\psi(s)]^n. \tag{3.3}
\end{align*}
Let N(t) denote the number of steps taken up to time t. To have N(t) = n, the walker
has to make n steps in some time interval (0, t′) and then make no further steps in the
time interval (t′, t). Since ψn is the probability density function for the time of occurrence
of the nth step and
\[
\Psi(t) = 1 - \int_0^t \psi(t')\, dt' = \int_t^\infty \psi(t')\, dt' \tag{3.4}
\]
is the probability that a walker arriving at a site pauses for at least time $t$, it follows that
\[
\Pr\{N(t) = n\} = \int_0^t \psi_n(t')\, \Psi(t - t')\, dt'. \tag{3.5}
\]
By Equation 3.3 we know $\psi_n(s) = [\psi(s)]^n$, and noting that
\[
\Psi(s) = \mathcal{L}[\Psi(t)] = s^{-1}[1 - \psi(s)], \tag{3.6}
\]
we find that
\[
\mathcal{L}[\Pr\{N(t) = n\}] = \frac{\psi(s)^n [1 - \psi(s)]}{s}. \tag{3.7}
\]
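As a consistency check on Equation 3.7 (this check is not part of the original derivation), consider an exponential waiting-time density $\psi(t) = \lambda e^{-\lambda t}$, where the rate $\lambda$ is a symbol introduced here only for illustration. Substituting its Laplace transform into Equation 3.7 and inverting gives:

```latex
\begin{align*}
\psi(s) &= \int_0^\infty e^{-st}\, \lambda e^{-\lambda t}\, dt = \frac{\lambda}{s + \lambda}, \\
\mathcal{L}[\Pr\{N(t) = n\}] &= \frac{1}{s} \left( \frac{\lambda}{s + \lambda} \right)^{n} \frac{s}{s + \lambda}
  = \frac{\lambda^n}{(s + \lambda)^{n+1}}, \\
\Pr\{N(t) = n\} &= \frac{(\lambda t)^n}{n!}\, e^{-\lambda t}.
\end{align*}
```

The result is the Poisson distribution, which is what one expects when the waiting times between steps are exponential, so Equation 3.7 reduces correctly in this special case.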
Current models of human activity are based on Poisson processes, and assume that
in a dt time interval an individual engages in a specific action with probability q dt,
where q is the overall average frequency of the monitored activity. In contrast, there is
increasing evidence that the timing of many human activities, ranging from communication
to entertainment and work patterns, follows non-Poisson statistics, characterized by bursts
of rapidly occurring events separated by long periods of inactivity. The bursty nature of
human behaviour is a consequence of a decision-based queueing process. When individuals
execute tasks based on some perceived priority, the timing of the tasks will be heavy tailed
(see [4]). This model predicts that the time interval between two consecutive actions by
the same individual, called the waiting time, follows a heavy-tailed power-law distribution
rather than an exponential one. The
measurement of our data confirms this conclusion (Figure 3.0.1).

Figure 3.0.1. Waiting time distribution generated from 6 week cell phone records (frequency against waiting time in minutes, log-log axes; the data are shown with a reference line of slope −0.95).

In our case, the waiting time probability density function is $\psi(t) = A t^{-\alpha}$, with $A$ as
a constant. The Laplace transformation of the function is
\[
\mathcal{L}[\psi(t)] = \psi(s) = A s^{\alpha - 1} \Gamma(1 - \alpha), \tag{3.8}
\]
where $\Gamma(x)$ is the Gamma function. By Equation 3.7, applying the inverse Laplace
transformation, we get
\[
\Pr\{N(t) = w\} = \mathcal{L}^{-1}\!\left[ \frac{\psi(s)^n [1 - \psi(s)]}{s} \right]
= [A t^{1-\alpha} \Gamma(1 - \alpha)]^{w+1}
\left[ \frac{1}{\Gamma(2 + w - \alpha - w\alpha)} - \frac{A t^{1-\alpha} \Gamma(1 - \alpha)}{\Gamma(3 + w - 2\alpha - w\alpha)} \right]. \tag{3.9}
\]
This is the probability of having exactly $w$ events within time $t$ given the waiting time
probability density $\psi(t) = A t^{-\alpha}$. In the next chapter, we will use this result to construct
our first model on dyadic growth.
4 Dyadic Dynamics
Suppose the network has n nodes at time t. Let p(w, ti, t) be the probability that an edge i
introduced at time ti has weight w at time t. If the system has a fixed number of nodes, all
the nodes are introduced at time 0 with no edges (trivial graph). In every step, m edges
out of n(n − 1)/2 edges are randomly selected. We view the weight increasing process
of a selected dyad as a random walk for which the waiting time probability density is
$\psi(t) = A t^{-\alpha}$.
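By Equation 3.9,
\[
\Pr\{N(t-1) = w\} = [A (t-1)^{1-\alpha} \Gamma(1 - \alpha)]^{w+1}
\left[ \frac{1}{\Gamma(2 + w - \alpha - w\alpha)} - \frac{A (t-1)^{1-\alpha} \Gamma(1 - \alpha)}{\Gamma(3 + w - 2\alpha - w\alpha)} \right], \tag{4.1}
\]
and
\[
\Pr\{N(t) = w+1\} = [A t^{1-\alpha} \Gamma(1 - \alpha)]^{w+2}
\left[ \frac{1}{\Gamma(3 + w - 2\alpha - w\alpha)} - \frac{A t^{1-\alpha} \Gamma(1 - \alpha)}{\Gamma(4 + w - 3\alpha - w\alpha)} \right]. \tag{4.2}
\]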
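Thus the probability for a single event to happen within one time step is approximately
the quotient of the two above,
\[
p \approx \frac{\Pr\{N(t) = w+1\}}{\Pr\{N(t-1) = w\}}
= A (t-1)^{w\alpha - w}\, t^{w - w\alpha - \alpha + 1}\, \Gamma(1 - \alpha)\,
\frac{\dfrac{1}{\Gamma(2 + w - \alpha - w\alpha)} - \dfrac{A t^{1-\alpha} \Gamma(1 - \alpha)}{\Gamma(3 + w - 2\alpha - w\alpha)}}
     {\dfrac{1}{\Gamma(1 + w - w\alpha)} - \dfrac{A (t-1)^{1-\alpha} \Gamma(1 - \alpha)}{\Gamma(2 + w - \alpha - w\alpha)}}. \tag{4.3}
\]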
This is the analytical expression for the probability that a single event happens within one
time step. However, we will use a much simpler alternative function $p = w/t$ in the model
to obtain an analytical weight distribution that approximates the real distribution. We now show
that this function is actually a reasonable assumption for the model.
Suppose $w$ events have happened before time step $t$. The probability of the next event
happening at time step $t + n$, based on the alternative function, is
\[
P(w+1; t+n) = P(w; t) \left(1 - \frac{w}{t+1}\right) \left(1 - \frac{w}{t+2}\right) \cdots \left(1 - \frac{w}{t+n-1}\right) \frac{w+1}{t+n}, \tag{4.4}
\]
and we have
\[
P(w+1; t+n+1) = P(w; t) \left(1 - \frac{w}{t+1}\right) \left(1 - \frac{w}{t+2}\right) \cdots \left(1 - \frac{w}{t+n}\right) \frac{w+1}{t+n+1}. \tag{4.5}
\]
By Equation 4.4 and Equation 4.5,
\[
\frac{P(w+1; t+n)}{P(w+1; t+n+1)} = \frac{t+n+1}{t+n-w}. \tag{4.6}
\]
By iterating Equation 4.6,
\[
\frac{P(w+1; t+n)}{P(w+1; t+n')} = \frac{t+n+1}{t+n-w} \cdot \frac{t+n+2}{t+n+1-w} \cdots \frac{t+n'}{t+n'-1-w}. \tag{4.7}
\]
Since we can arbitrarily set the time when the $w$-th event happens to be zero, Equation
4.7 becomes
\[
\frac{P(w+1; n)}{P(w+1; n')} = \frac{(n+1)(n+2) \cdots n'}{(n-w)(n+1-w) \cdots (n'-1-w)}. \tag{4.8}
\]
When $w = 0$,
\[
\frac{P(w+1; n)}{P(w+1; n')} = \frac{(n+1)(n+2) \cdots n'}{n(n+1) \cdots (n'-1)} = \left(\frac{n}{n'}\right)^{-1}. \tag{4.9}
\]
When $w = 1$,
\[
\frac{P(w+1; n)}{P(w+1; n')} = \frac{(n+1)(n+2) \cdots n'}{(n-1)n \cdots (n'-2)} \sim \left(\frac{n}{n'}\right)^{-2}. \tag{4.10}
\]
In general,
\[
\frac{P(w+1; n)}{P(w+1; n')} \sim \left(\frac{n}{n'}\right)^{-(w+1)}, \tag{4.11}
\]
where $P(w; n)$ can be interpreted as the waiting time distribution between the $w$-th event and
the $(w+1)$-th event. The power-law form of Equation 4.11 is consistent with the observed
scale-free waiting time distribution. Therefore $w/t$ can serve as a simplification of Equation
4.3.
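The quality of the power-law approximation can be checked numerically by evaluating the product on the right-hand side of Equation 4.8 with exact fractions. The function name and the values n = 100 and n′ = 200 are illustrative assumptions; the check requires n > w so that no factor in the denominator vanishes.

```python
from fractions import Fraction

def waiting_ratio(w, n, n_prime):
    """Right-hand side of Equation 4.8: the product of (k + 1) / (k - w)
    for k = n, ..., n' - 1 (requires n > w)."""
    ratio = Fraction(1)
    for k in range(n, n_prime):
        ratio *= Fraction(k + 1, k - w)
    return ratio

n, n_prime = 100, 200                       # illustrative values
for w in range(4):
    exact = waiting_ratio(w, n, n_prime)
    power_law = Fraction(n, n_prime) ** (-(w + 1))
    print(w, float(exact), float(power_law))
```

For w = 0 the two columns agree exactly, as in Equation 4.9, and for small w the exact product stays within a few percent of the power law as long as n is much larger than w.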
Back to the model, at time $t$, if the weight $w_i$ of a selected edge $i$ is not zero, $w_i$ increases
by 1 with probability $w_i/t$ following the previous argument; otherwise it stays the same.
If $w_i = 0$, the probability of evolving is $\varepsilon/t$ where $\varepsilon$ is some real number between 0 and 1.
Consequently the master equation governing $p(w, t_i, t)$ has the form
\[
p(w, t_i, t+1) =
\begin{cases}
\dfrac{w-1}{t}\, p(w-1, t_i, t) + \left(1 - \dfrac{w}{t}\right) p(w, t_i, t) & \text{if } w > 1, \\[1ex]
\dfrac{\varepsilon}{t}\, p(0, t_i, t) + \left(1 - \dfrac{1}{t}\right) p(1, t_i, t) & \text{if } w = 1.
\end{cases} \tag{4.12}
\]
It is not always easy to measure the newly formed nodes in a large network. In the general
case, the proportion of weight $w$ edges in the whole system $P(w, t)$ at time $t$ is directly
measurable and thus we are more interested in solving $P(w, t)$. Since $0 < P(w, t) < 1$, for
mathematical convenience, let
\[
P(w, t) = \frac{1}{t} \sum_{t_i} p(w, t_i, t). \tag{4.13}
\]
Summing Equation 4.12 up from $t_i = 1$ to $t_i = t$, when $w > 1$, we get
\[
(t+1) P(w, t+1) - p(w, t+1, t+1) = \frac{w-1}{t}\, t P(w-1, t) + \left(1 - \frac{w}{t}\right) t P(w, t).
\]
Reducing the right side of the equation,
\[
(t+1) P(w, t+1) - p(w, t+1, t+1) = (w-1) P(w-1, t) + (t-w) P(w, t). \tag{4.14}
\]
Newly added nodes have no edges to any other nodes in the network. When $w$ is not zero,
$p(w, t+1, t+1) = 0$. Assuming the limit $P(w) = P(w, t \to \infty)$ exists and
$t[P(w, t+1) - P(w, t)] \to 0$ as $t \to \infty$, we get
\[
(w+1) P(w) = (w-1) P(w-1). \tag{4.15}
\]
When $w = 1$, following the similar steps above,
\[
(t+1) P(1, t+1) - p(1, t+1, t+1) = \varepsilon P(0, t) + (t-1) P(1, t).
\]
For $t \to \infty$,
\[
2 P(1) = \varepsilon P(0). \tag{4.16}
\]
If $P(1)$ is some constant $C_1$,
\[
P(w) = \frac{2 C_1}{w(w+1)} = \frac{\varepsilon P(0)}{w(w+1)} \sim w^{-2}. \tag{4.17}
\]
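The closed form in Equation 4.17 can be verified by iterating the recurrences of Equations 4.15 and 4.16 with exact fractions. The function name, the cutoff w = 1000 and the values chosen for ε and P(0) (the latter set to 1 purely as a reference scale) are illustrative assumptions.

```python
from fractions import Fraction

def stationary_weight_distribution(epsilon, p0, w_max):
    """Iterate 2 P(1) = epsilon P(0) and (w + 1) P(w) = (w - 1) P(w - 1)."""
    P = {1: epsilon * p0 / 2}
    for w in range(2, w_max + 1):
        P[w] = Fraction(w - 1, w + 1) * P[w - 1]
    return P

epsilon, p0 = Fraction(1, 100), Fraction(1)    # illustrative values
P = stationary_weight_distribution(epsilon, p0, w_max=1000)
closed_form = {w: epsilon * p0 / (w * (w + 1)) for w in P}
print(P == closed_form)
print(float(P[1000] * 1000 ** 2 / (epsilon * p0)))
```

The recurrence telescopes to 2P(1)/(w(w + 1)), so the two dictionaries agree exactly, and the rescaled value P(w)w² approaches εP(0) from below as w grows, which is the w⁻² tail.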
The exponent measured from real data is about −1.95 (see Figure 4.0.1). To verify the
model, we used a Monte Carlo simulation to generate the weight distribution following
the assumptions of the model and compared it with the real-world data (see Figure 4.0.2,
for code see 7.1.1). For every time step in the simulation process, a fixed number of node
pairs are randomly picked from all the nodes. For each of these pairs, we compare the ratio
of the existing weights between the two nodes to the maximum possible weights w = t at
time step t with a generated random number between 0 and 1 to determine whether the
weight of this pair increases by 1.
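The simulation procedure described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions and not the code of Appendix 7.1.1: the function name, the seed, the rule that each pair is tried at most once per time step, and the parameter values (far smaller than those of Figure 4.0.2 so that it runs quickly) are all choices made here for illustration.

```python
import random
from collections import Counter

def simulate_dyads(n_nodes, m, T, epsilon, seed=0):
    """Monte Carlo sketch of the dyad model with growth probability w/t."""
    rng = random.Random(seed)
    weight = {}                        # (i, j) with i < j -> tie weight >= 1
    for t in range(1, T + 1):
        chosen = set()                 # m distinct node pairs for this step
        while len(chosen) < m:
            i = rng.randrange(n_nodes)
            j = rng.randrange(n_nodes - 1)
            if j >= i:
                j += 1                 # shift so that j != i
            chosen.add((i, j) if i < j else (j, i))
        for pair in chosen:
            w = weight.get(pair, 0)
            # existing ties grow with probability w/t, new ties with epsilon/t
            p = w / t if w > 0 else epsilon / t
            if rng.random() < p:
                weight[pair] = w + 1
    return weight

T = 50
weight = simulate_dyads(n_nodes=100, m=2000, T=T, epsilon=0.01)
histogram = Counter(weight.values())   # weight -> number of ties
for w in sorted(histogram)[:5]:
    print(w, histogram[w])
```

Because each pair is tried at most once per step, a weight can never exceed the current time t, so w/t stays a valid probability. Reproducing the slope in Figure 4.0.2 would need the much larger system size and run length quoted in its caption.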
So far the prediction of the dyad model is roughly consistent with the measurement
from the real data. However, the dyad model has its deficiencies that lead us to the next
chapter on triadic dynamics.
Figure 4.0.1. Distribution of dyad weights from the data (P(w) against w, log-log axes; the data are shown with a reference line of slope −1.95).

Figure 4.0.2. The comparison of a computer-simulated weight distribution and the real-world data. The simulation is generated for 5000 nodes, ε = 0.01, over T = 200 time steps and m = 2 × 10^6 every step (P(w) against w, log-log axes; series: data, simulation, and a reference line of slope −2).

Figure 4.0.3. Other potential evolving probability function trial results: $w^2/t^2$ (yellow), $w^{0.4}/t$ (light blue), $w^{0.7}/t$ (pink), $w^{0.9}/t$ (blue), $w/t$ (red) compared with the real data in green (P(w) against w, log-log axes).
5 Triadic Dynamics
In real world networks, motifs such as triads and clusters are over-represented as compared
to randomized networks with the same degree distribution, which we also observe from
our data. If the triads in the network are purely random combinations of the weighted
dyads, then the probability of a triad containing three dyads of weights $a$, $b$, $c$ is
\[
P(a, b, c) = P(a) P(b) P(c) \sim (abc)^{-\gamma}.
\]
$P(a, b, c)$ should then follow a scale-free distribution in $abc$ with the same slope $-\gamma$ as the
dyad weights. However, $P(a, b, c)$
in Figure 5.0.1 from the real data is inconsistent with this expectation: its slope lies
between −1.1 and −1.5. This indicates that
the weighted dyads are organized into the community structure following some non-trivial
mechanisms.
Sociological theories that address the dynamics of social ties postulate that individual
commonalities cause tie formation or that rational people build up ties for strategic reasons
to maximize utility, such as networking with popular (high-degree) people. Other theories
suggest that processes of dyad formation and persistence are not only a function of purely
node characteristics and dyadic factors but are also affected by the embeddedness of dyads in
triadic configurations from a balance view (see 2.2.1 for social balance theory). For example,
job hunters network with the employees in the industry for advice on their credentials.
They may also build up the relationships in order to be introduced to the third parties
who make decisions in hiring. Moreover, the strength of a tie between two nodes increases
with the overlap of their friendship circles in general [14]. The overlap of friendship circles
may imply that the two nodes belong to the same community. The real explanation of the
dyad formation process may be a combination of the purely dyadic view on dyad factors
(such as reciprocity and strength) and the global view on dyadic embeddedness in triads
and even higher-order network structures.

Figure 5.0.1. P(a,b,c)-abc plot. P(a,b,c) spans over the 1.1–1.5 slope range (P(abc) against abc, log-log axes; the data are shown with reference lines of slope −1.5 and −1.1).

Therefore, following the rationale above, we modify the model of the previous chapter
by adding one more assumption: two nodes are more likely to form or maintain a tie
if they have at least one common friend, that is, a third node connected with both. To
quantify a relationship, we label a tie as +1 if the two nodes are friends and −1 if they are
enemies, following the notation used in social balance theory. A triad ABC is balanced if
the product of the signs of the three ties equals +1. If the cell phone calling records capture
mostly positive ties in the network ($w_{AB} > 0$ implying $s_{AB} = +1$), the new assumption
is equivalent to the general balancing rules of social triads. For a triad of three nodes A,
B, C, if $s_{AB} = +1$, $s_{AC} = +1$ and $s_{BC} = -1$, then according to the theory B and C
have a high probability of becoming friends ($s_{BC} = +1$) in order to balance the triad.
Similarly, the modified model states that tie BC is likely to be observed in the data given
non-zero $w_{AB}$ and $w_{AC}$. On the other hand, balance theory also suggests that $s_{AB}$ or
$s_{AC}$ might turn negative to balance the triad. Correspondingly, in the modified model
$w_{AB}$ and $w_{AC}$ are less likely to increase if $w_{BC}$ is zero, which corresponds to
$s_{AB} = -1$ or $s_{AC} = -1$. Moreover, our model is able to study the "positiveness" of the
ties quantitatively, in contrast to the qualitative notation of the existing social balance models.
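The balance rule above can be sketched as a small C++ predicate. This is only an illustration of the sign convention in the text; the function name `isBalanced` and the precondition check are assumptions made for the example and do not come from the project code.

```cpp
#include <cassert>

// Sign convention from the text: +1 for a friendly tie, -1 for a hostile tie.
// A triad ABC is balanced when the product of its three signs equals +1.
// isBalanced is a hypothetical helper name used only for this illustration.
bool isBalanced(int sAB, int sAC, int sBC) {
    // Precondition (assumption of this sketch): every sign is exactly +1 or -1.
    assert((sAB == 1 || sAB == -1) &&
           (sAC == 1 || sAC == -1) &&
           (sBC == 1 || sBC == -1));
    return sAB * sAC * sBC == 1;
}
```

For the example in the text, `isBalanced(+1, +1, -1)` is false. The triad becomes balanced either when B and C become friends, `isBalanced(+1, +1, +1)`, or when one of the ties to A turns negative, `isBalanced(-1, +1, -1)`.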
Let $p(w_i, t_i, t)$ be the probability that an edge $i$ introduced at time $t_i$ has weight
$w_i$ at time $t$. In every step, $m$ edges out of the $n(n-1)/2$ possible edges are randomly
selected. At time $t$, if the weight $w_i$ of a selected edge $i$ is not 0, $w_i$ increases by 1
with the new probability
$$\frac{w_i + \delta \sum_c (w_j + w_k)}{t}, \tag{5.1}$$
where $\sum_c (w_j + w_k)$ is the sum of the dyad weights between A and B, the two nodes of
edge $i$, and each of their common friends. If C is a common friend of A and B, the
tie weights $w_j$ between A and C and $w_k$ between B and C are both non-zero and are included
in the sum. $\delta$ is the coefficient that measures the contribution of triads to the dyad. If
$w_i = 0$ and $\sum_c (w_j + w_k)$ is also zero (A and B have no common friend), the probability
of evolving is $\varepsilon/t$, where $\varepsilon$ is some real number between 0 and 1. Consequently
the master equation governing $p(w_i, t_i, t)$ has the form
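$$
p(w_i, t_i, t+1) =
\begin{cases}
\dfrac{w_i - 1 + \delta \sum_c (w_j + w_k)}{t}\, p(w_i - 1, t_i, t)
 + \left(1 - \dfrac{w_i + \delta \sum_c (w_j + w_k)}{t}\right) p(w_i, t_i, t)
 & \text{if } w_i > 1,\\[2ex]
\dfrac{\delta \sum_c (w_j + w_k)}{t}\, p(0, t_i, t)
 + \left(1 - \dfrac{1 + \delta \sum_c (w_j + w_k)}{t}\right) p(1, t_i, t)
 & \text{if } w_i = 1 \text{ and } \sum_c (w_j + w_k) \neq 0,\\[2ex]
\dfrac{\varepsilon}{t}\, p(0, t_i, t)
 + \left(1 - \dfrac{1}{t}\right) p(1, t_i, t)
 & \text{if } w_i = 1 \text{ and } \sum_c (w_j + w_k) = 0.
\end{cases}
\tag{5.2}
$$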
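The growth probability that enters the master equation can be sketched as a single C++ function. This is a minimal sketch under stated assumptions, not the project's implementation. The name `growthProbability`, the clamp to 1, and the precondition `t >= 1` are hypothetical additions for the example. The case without a tie and without common friends returns a constant ε, which follows the listing in 7.1.2 rather than a time-dependent value.

```cpp
#include <algorithm>
#include <cassert>

// Probability that a selected dyad gains one unit of weight at time step t.
//   w       : current weight of the dyad (w_i)
//   extra   : sum over common friends of (w_j + w_k)
//   delta   : contribution of triads to the dyad
//   epsilon : chance of creating a tie that has no weight and no common friends
// growthProbability is a hypothetical name used only for this illustration.
double growthProbability(int w, int extra, int t, double delta, double epsilon) {
    // Assumption of this sketch: t starts at 1, so the ratio below is defined.
    assert(w >= 0 && extra >= 0 && t >= 1);
    if (w == 0 && extra == 0) {
        return epsilon;  // constant epsilon, as in the tsim.cpp listing
    }
    double p = (w + delta * extra) / t;
    return std::min(p, 1.0);  // clamp added here: the ratio can exceed 1
}
```

With `delta` set to 0 the function returns w/t whenever the dyad already has weight, which is the purely dyadic rule.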
Equation 5.2 is not analytically solvable due to its nonlinearity. We use computer
simulation to predict the outcome of the triad mechanism, displayed in Figure 5.0.2 (see code
7.1.2). For every time step in the simulation process, a fixed number of node pairs are
randomly picked from all the nodes. For each pair, we examine its existing common friends
and add the discounted extra weights from the common friends to the existing weight between
the two nodes. We then draw a random number between 0 and 1 and compare it with the ratio
of the adjusted total weight to the maximum possible weight w = t at time step t, or with ε
if the adjusted total weight is zero; the weight of the pair increases by 1 if the random
number is smaller.
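The simulation loop described above can be sketched compactly for a small network. This is not the code of 7.1.2 but an illustrative variant under several assumptions. It stores weights in a dense symmetric matrix and examines every third node as a possible common friend. It also starts the clock at t = 1, skips self-pairs, and uses a seeded `std::mt19937` so that runs are repeatable. The names `Network`, `commonFriendWeight` and `simulate` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Dense symmetric weight matrix; at(a, b) is the weight of dyad (a, b).
struct Network {
    int n;
    std::vector<int> W;
    explicit Network(int nodes)
        : n(nodes), W(static_cast<std::size_t>(nodes) * nodes, 0) {}
    int& at(int a, int b) { return W[static_cast<std::size_t>(a) * n + b]; }
};

// Sum of (w_AC + w_BC) over every common friend C of a and b.
int commonFriendWeight(Network& net, int a, int b) {
    int sum = 0;
    for (int c = 0; c < net.n; ++c) {
        if (c == a || c == b) continue;
        int wac = net.at(a, c), wbc = net.at(b, c);
        if (wac > 0 && wbc > 0) sum += wac + wbc;
    }
    return sum;
}

// Runs `steps` time steps with m random pairs per step and returns the network.
Network simulate(int n, int steps, int m, double delta, double epsilon,
                 unsigned seed) {
    assert(n >= 2 && steps >= 0 && m >= 0);
    Network net(n);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> pickNode(0, n - 1);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    for (int t = 1; t <= steps; ++t) {      // assumption: t starts at 1
        for (int k = 0; k < m; ++k) {
            int a = pickNode(rng), b = pickNode(rng);
            if (a == b) continue;           // assumption: skip self-pairs
            int w = net.at(a, b);
            int extra = commonFriendWeight(net, a, b);
            // epsilon for a pair with no tie and no common friend,
            // otherwise (w + delta * extra) / t
            double p = (w == 0 && extra == 0) ? epsilon
                                              : (w + delta * extra) / t;
            if (uniform(rng) < p) {
                ++net.at(a, b);
                ++net.at(b, a);             // keep the matrix symmetric
            }
        }
    }
    return net;
}
```

A call such as `simulate(500, 350, 75000, 0.0012, 0.01, 1u)` mirrors the parameter values quoted for the triadic simulation, although the dense common-friend scan makes this sketch slower than a sparse representation would be.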
The dyad weight distribution derived from the triad mechanism is slightly curved
compared with the previous outcome of the purely dyadic process, providing a better fit to
the real data. We notice that when δ = 0, the triad process reduces to the pure dyad
process. The δ value tunes the curvature of the outcome: the larger δ is, the more
curved the plot will be.
Figure 5.0.2. Comparison of the computer-simulated weight distributions based on the dyad
and triad mechanisms with the real-world data (log-log plot of P(w) against w). The triadic
simulation is generated for 500 nodes, ε = 0.01, δ = 0.0012, over T = 350 time steps and
m = 75000 every step.
6 Conclusion and Future Work
Motivated by our observation of the power-law distribution of tie weights together with
the over-representation of triads in the network, we have proposed a stochastic model for
weighted scale-free networks in which high-weight ties are assigned a higher probability
of growth. In order to derive a plausible probability density function of weight growth, we
discuss continuous random walks and the waiting time density, which is directly measurable
from the data. We show the whole calculation that derives the general expression for the
probability of walking n steps in time t given the waiting time distribution, and the
specific expression for the function $A t^{-\alpha}$ obtained from the empirical data. We build
the purely dyadic model using an alternative probability function simplified from the
analytical solution derived earlier, and reproduce the scale-free distribution
$P(w) \sim w^{-2}$ observed in the data of a real-world communication network among millions
of users.
In the last chapter we discuss dyad evolution in the context of local triads. We review
the existing literature on social balance theory and assume that the processes of tie
formation and persistence are not only functions of purely dyadic factors but are also
affected by the embeddedness of dyads in triadic configurations. The modified model
incorporates the weights contributed by common friends in the density function for weight
growth and explains the concavity of the dyad weight distribution curve, as confirmed
numerically by Monte Carlo simulation.
This project, as part of an ongoing NSF-funded social network study, contributes to the
dynamic analysis of the processes of tie formation, stability and change at the dyad and
triad level. It leads to a systematic understanding of the fundamental processes and
interdependencies underlying social network dynamics, which has deep implications for
large-scale human interaction behavior. Further investigation could be extended to
the possible correlation between reciprocity and transitivity. Reciprocal ties may indicate
strong friendships that last for a long time and contribute more to the strengths
of the other dyads within a triad than non-reciprocal ties do. The model could therefore be
further revised for triads consisting of six directed weights to produce a more accurate
simulation. Motifs of higher order, such as quadrads, may also influence network dynamics,
but their influence is probably less significant than that of dyads and triads.
In addition, we may consider individual node attributes along with the global-specific
parameters associated with the dyads and triads, weighting sociodemographic information
such as age, gender and race to capture the similarity between nodes, which may increase
the likelihood of local community formation.
7 Appendix
7.1 Code
7.1.1 Dyad process simulation
// dsim.cpp
/******************************************************
dsim.cpp is a Monte Carlo simulation code of the dyadic
model. For every time step in the simulation process, a
fixed number of node pairs are randomly picked from all
the possible combinations. For each of these pairs, we
compare the ratio of the existing weights between the
two nodes to the maximum possible weights at this time
step with a generated random number between 0 and 1, or
with epsilon if the existing weights are zero, to determine
whether the weight of this pair increases by 1.

The output is the dyad weight distribution written into
a file after the simulation is done.
*******************************************************/
#include <cstdlib>
#include <ctime>
#include <fstream>
#include <iostream>
#include <utility>
#include <vector>
using namespace std;

// uniformly distributed random number in [0,1) from the built-in generator
double uniform01() {
    return double(rand()) / (double(RAND_MAX) + 1.0);
}

int main(int argc, char *const argv[]) {
    if (argc < 2) {
        cerr << "usage: dsim <output file>" << endl;
        return 1;
    }
    ofstream ouf(argv[1]);
    int const max = 5000;        // fixed total number of nodes in the network
    int const stride = 10000;    // dyad (caller, callee) is stored at caller*stride+callee
    int const maxw = 1000;       // largest weight counted in the distribution
    double const epsilon = 0.01;
    vector<int> r(50005000, 0);  // dyad weights, allocated on the heap and zeroed
    vector<int> w(maxw + 1, 0);  // dyad weight distribution, w[1..maxw]
    srand((unsigned)time(NULL)); // set the random number seed by system time

    for (int t = 0; t < 200; t++) {          // simulation process
        for (int m = 2000000; m > 0; m--) {  // m dyads are selected every time step
            int caller = int(max * uniform01());  // pick a random caller
            int callee = int(max * uniform01());  // pick a random callee
            /* since we only use undirected dyads, keep the
               caller index smaller than the callee index. */
            if (caller > callee) swap(caller, callee);
            int &weight = r[caller * stride + callee];
            if (weight != 0) {
                /* the caller and callee have a non-zero tie between them;
                   use w/t as the probability of growth. At t = 0 the ratio
                   is unbounded, so the tie always grows. */
                if (t == 0 || uniform01() < double(weight) / double(t)) weight++;
            } else {
                /* the caller and callee have no existing tie between them */
                if (uniform01() < epsilon) weight++;
            }
        }
    }
    for (size_t i = 0; i < r.size(); i++) {  // find the dyad weight distribution
        if (r[i] != 0 && r[i] <= maxw) w[r[i]]++;
    }
    for (int i = 1; i <= 200; i++) {         // output
        ouf << i << ' ' << w[i] << endl;
    }
    return 0;
}
7.1.2 Triad process simulation
// tsim.cpp
/******************************************************
tsim.cpp is a Monte Carlo simulation code of the triadic
model. For every time step in the simulation process, a
fixed number of node pairs are randomly picked from all
the possible combinations. For each of these pairs, we
examine the existing common friends of the pair, and
add the discounted extra weights from common friends
to the existing weights between the two nodes. We compare
the ratio of the adjusted total weights to the maximum
possible weights at this time step with a generated random
number between 0 and 1, or with epsilon if the adjusted
total weights are zero, to determine whether the weight of
this pair increases by 1.

The output is the detailed dyad record containing caller,
callee, weight (first file) and the dyad weight distribution
(second file) after the simulation is done.
*******************************************************/
#include <cstdlib>
#include <ctime>
#include <fstream>
#include <iostream>
#include <map>
#include <utility>
#include <vector>
using namespace std;

// uniformly distributed random number in [0,1) from the built-in generator
double uniform01() {
    return double(rand()) / (double(RAND_MAX) + 1.0);
}

/* weight of the tie between a and b (0 if there is none); every tie is
   stored once, on the list of its smaller-indexed node */
int tieWeight(const vector<map<int, int>> &h, int a, int b) {
    if (a > b) swap(a, b);
    auto it = h[a].find(b);
    return it == h[a].end() ? 0 : it->second;
}

int main(int argc, char *const argv[]) {
    if (argc < 3) {
        cerr << "usage: tsim <dyad record file> <distribution file>" << endl;
        return 1;
    }
    ofstream myfile(argv[1]);    // detailed dyad record
    ofstream myfile2(argv[2]);   // dyad weight distribution
    int const max = 500;         // fixed total number of nodes from the beginning
    int const maxw = 300;        // largest weight counted in the distribution
    double const epsilon = 0.01;
    double const delta = 0.0012; // discount on the extra weight from common friends
    vector<map<int, int>> h(max); // h[i]: callees of caller i with tie weights, sorted by index
    vector<int> w(maxw + 1, 0);   // dyad weight distribution, w[1..maxw]
    srand((unsigned)time(NULL));  // set the random seed by time

    for (int t = 0; t < 350; t++) {        // simulation process
        for (int m = 75000; m > 0; m--) {  // m pairs are selected each step
            int caller = int(max * uniform01());  // pick a random caller
            int callee = int(max * uniform01());  // pick a random callee
            /* since we only use undirected dyads, keep the
               caller index smaller than the callee index. */
            if (caller > callee) swap(caller, callee);
            int j = tieWeight(h, caller, callee); // existing weight of the pair
            int extraw = 0;                       // extra weight from common friends
            /* walk the caller's list; a third node that is also tied to the
               callee is a common friend, so add the caller-third and
               third-callee weights. Note that only the thirds stored on the
               caller's list, i.e. those with an index larger than the
               caller's, are examined. */
            for (const auto &entry : h[caller]) {
                int third = entry.first;
                if (third == callee) continue;
                int other = tieWeight(h, third, callee);
                if (other != 0) extraw += entry.second + other;
            }
            if (j == 0 && extraw == 0) {
                /* the caller and callee have no existing tie between
                   them and no common friends */
                if (uniform01() < epsilon) h[caller][callee] += 1;
            } else {
                /* using (w + discounted extra weight)/t as the probability
                   of growth. At t = 0 the ratio is unbounded, so the tie
                   always grows. */
                if (t == 0 || uniform01() < (j + delta * extraw) / double(t))
                    h[caller][callee] += 1;
            }
        }
    }
    for (int i = 0; i < max; i++) {  // find the dyad weight distribution
        for (const auto &entry : h[i]) {
            // output the record of caller, callee, weight
            myfile << i << ' ' << entry.first << ' ' << entry.second << endl;
            if (entry.second > 0 && entry.second <= maxw) w[entry.second]++;
        }
    }
    for (int i = 1; i <= maxw; i++) {  // output dyad distribution
        myfile2 << i << ' ' << w[i] << endl;
    }
    return 0;
}
7.1.3 Triad weight distribution code
// triad_wd.cpp
/******************************************************
triad_wd.cpp produces the distribution of the three
dyad weights within a complete triad. The code iterates
through all the pairs of callees of a node to find the
three weights.

The input file should contain formatted information of
caller, callee, weight. The output is a combination of
three weights and the count of this type of triad on one
line. The counted weights here are between 1 and 99.
*******************************************************/
#include <algorithm>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <map>
#include <vector>
using namespace std;

int main(int argc, char *const argv[]) {
    if (argc < 4) {
        cerr << "usage: triad_wd <input file> <output file> <number of nodes>" << endl;
        return 1;
    }
    ifstream inf(argv[1]);
    ofstream ouf(argv[2]);
    int const total = atoi(argv[3]);  // total number of nodes in the network
    vector<map<int, int>> h(total);   // h[i]: callees of caller i with tie weights
    vector<int> b(1000000, 0);        // b[r1*10000+r2*100+r3] counts triads r1<=r2<=r3

    int c1, c2, c3;
    /* data file input. Each line contains caller index, callee index, weight.
       make sure the caller index is always less than the callee's.
       Reading stops at the first line that cannot be parsed, so the last
       record is not processed twice. */
    while (inf >> c1 >> c2 >> c3) {
        if (c1 != c2 && c1 >= 0 && c1 < total) h[c1][c2] += c3;
    }
    inf.close();

    for (int i = 0; i < total; i++) {
        // iterate through all the pairs (m, n) of callees of caller i, n > m
        for (auto pm = h[i].begin(); pm != h[i].end(); ++pm) {
            auto pn = pm;
            for (++pn; pn != h[i].end(); ++pn) {
                int m = pm->first, n = pn->first;
                // weights of the ties i-m, m-n and i-n; m-n is 0 until found
                int r[3] = {pm->second, 0, pn->second};
                if (m >= 0 && m < total) {  // searching for n on m's list
                    auto it = h[m].find(n);
                    if (it != h[m].end()) r[1] = it->second;
                }
                sort(r, r + 3);  // keep r1<=r2<=r3
                // count complete triads whose three weights are all below 100
                if (r[0] > 0 && r[2] < 100) b[r[0] * 10000 + r[1] * 100 + r[2]]++;
            }
        }
    }
    for (int i = 0; i < 1000000; i++) {  // output
        int r3 = i % 100;
        int r2 = (i / 100) % 100;
        int r1 = i / 10000;
        if (r1 != 0 && r1 <= r2 && r2 <= r3)
            ouf << r1 << ' ' << r2 << ' ' << r3 << ' ' << b[i] << endl;
    }
    ouf.close();
    return 0;
}
7.2 Other results in triads
We plot the weight distributions of the three dyads in a triad from the real-world data
and from the simulation based on the modified model. The triad weight distribution from the
data shows patterns that are yet to be explained. We cannot yet compare the simulation
result visually with the real data, since the simulation would need a large number of
nodes, a long running time and large computer storage space to be comparable with the
real-world activity.
Figure 7.2.1. Triad weight distribution from the data. The three axes represent the weights
of the three constituent dyads. The color scale indicates log(frequency).
Figure 7.2.2. Triad weight distribution from the simulation. The three axes represent the
weights of the three constituent dyads. The color scale indicates log(frequency).
Bibliography
[1] T. Antal, P. L. Krapivsky, and S. Redner, Dynamics of Social Balance on Networks, Physical Review E 72 (2005), 036121.
[2] T. Antal, P. L. Krapivsky, and S. Redner, Social balance on networks: The dynamics of friendship and enmity, Physica D 224 (2006), 130–136.
[3] A.-L. Barabasi and R. Albert, Emergence of Scaling in Random Networks, Science 286 (1999), 509–512.
[4] A.-L. Barabasi, The Origin of Bursts and Heavy Tails in Human Dynamics, Nature 435 (2005), 207–211.
[5] D. Cartwright and F. Harary, A generalization of Heider's theory, Psychological Review 63 (1956), 277–292.
[6] Paul Erdos and Alfred Renyi, On Random Graphs, Publ. Math. Debrecen 6 (1959), 290–297.
[7] Caroline Haythornthwaite, Social networks and Internet connectivity effects, Information Communication and Society 8 (2005), 125–147.
[8] F. Heider, Attitudes and cognitive organization, Journal of Psychology 21 (1946), 107–112.
[9] Cesar Hidalgo and C. Rodriguez-Sickert, The dynamics of a mobile phone network, Physica A 387 (2008), 3017–3024.
[10] Barry D. Hughes, Random Walks and Random Environments, Oxford University Press, USA, 1995.
[11] Renaud Lambiotte, Vincent D. Blondel, and Cristobald de Kerchove, Geographical dispersal of mobile communication networks, Physica A 387 (2008), 5317–5325.
[12] Donald A. McQuarrie, Statistical Mechanics, University Science Books, Sausalito, CA, 2000.
[13] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon, Network Motifs: Simple Building Blocks of Complex Networks, Science 298 (2002), 824–827.
[14] J.P. Onnela, J. Saramaki, J. Hyvonen, G. Szabo, D. Lazer, K. Kaski, J. Kertesz, and A.-L. Barabasi, Structure and tie strengths in mobile communication networks, Proceedings of the National Academy of Sciences 104 (2007), 7332–7336.
[15] L.E. Reichl, A Modern Course in Statistical Physics, Wiley Publication, New York, NY, 1998.
[16] Stanley Wasserman and Katherine Faust, Social network analysis: methods and applications, Cambridge University Press, New York, NY, 1994.
[17] Zhechao Zhou, Heterogeneous Patterns in Human Communication, Unpublished work (2008).