[ieee 2011 baltic congress on future internet communications (bcfic riga) - riga, latvia...

A novel model for social networks

Sreedhar Bhukya

Department of Computer and Information Sciences University of Hyderabad, Hyderabad- 500046, India

Email: [email protected]

Abstract: A number of recent studies on social networks are based on a set of characteristics that includes assortative mixing, high clustering, short average path lengths, broad degree distributions and the existence of community structure. Here, a model that satisfies all of the above characteristics is developed. In addition, the model facilitates interaction between different communities. It gives a very high clustering coefficient while retaining the asymptotically scale-free degree distribution. The community structure arises from a mixture of random attachment and implicit preferential attachment. Whereas earlier work considered only the Neighbour of Initial Contact (NIC) as an implicit preferential contact, we also consider the Neighbour of Neighbour of Initial Contact (NNIC). The model also allows a contact to form between two initial contacts if the new vertex chooses more than one initial contact. This ultimately produces a more complex social network than the one taken as the basic reference.

Keywords: social networks; complex networks; community structure; novel model; initial contact; secondary contact; tertiary contact; clustering

I. INTRODUCTION

Social networks are made of nodes that are tied by one or more specific types of relationships. The vertices represent individuals or organizations. Social networks have been studied intensively by social scientists [2-4] for several decades in order to understand local phenomena, such as local formation and its dynamics, as well as network-wide processes, such as the transmission of information and the spreading of disease, rumours and ideas. Various types of social networks, such as those related to professional collaboration [5-7], Internet dating [8], and opinion formation among people, have been studied. Social networks span financial, cultural, educational and family relations, among others. Their study draws on sociology, basic mathematics and graph theory, and the basic mathematical structure for a social network is a graph. The main properties of social networks include hierarchical community structure [9], the small-world property [10] and a power-law distribution of vertex degrees [18]; the most basic model of scale-free networks is that of Barabási and Albert [11]. As online social networks gain popularity, the scientific community is increasingly attracted by the research opportunities they offer. Among the most popular online social networks is Facebook, where users can add friends, send them messages, and update their personal profiles to notify friends about themselves. Essential characteristics of social networks are believed to include assortative mixing [12,13], high clustering, short average path lengths, broad degree distributions [14,15], and the existence of community structure. A community can be described, roughly speaking, as a set of vertices with dense internal connections, such that the inter-community connections are relatively sparse.

Here we take an existing model [1] of social networks and extend it. The existing algorithm consists of two growth processes: (1) random attachment and (2) implicit preferential attachment, which results from following edges from the randomly chosen initial contacts. The existing model is lacking in the following two respects:

1. There is no connection between initial contacts when more than one initial contact is chosen.

2. There is no connection between an initial contact and the neighbours of its neighbours.

These two aspects have a considerable effect in the following case. Consider a person who contacts a member of a research group for some purpose and does not get adequate support from that member or from that member's neighbours, but could get the required support from a friend of a friend of the initial contact. The only way the new person can get help is for the initial contact to create a contact with that friend of a friend and to introduce the new person to him or her. The same thing happens in day-to-day life: if a person contacts us for some purpose and we are unable to help, we try to help through our friends' contacts, and in the extreme case we may contact a friend of a friend. We have implemented this behaviour in our new model. In the old model only information about friends was updated, whereas in our model information about friends of friends is updated as well. This makes the social network more complex, but information or data can be shared much faster. The model thus serves the purpose of social networking efficiently, with a faster growth rate, while preserving the community structure.

2011 Baltic Congress on Future Internet and Communications

978-1-4244-8513-0/11/$26.00 ©2011 IEEE

II. NOVEL NETWORK GROWTH ALGORITHM

The algorithm includes three processes: (1) random attachment; (2) implicit preferential contact with the neighbours of the initial contacts; and (3) in addition to the above, a proposed contact between an initial contact and its neighbours of neighbours (tertiary contacts). The algorithm of the model, following [1], is:

1. Start with a seed network of N vertices.
2. Pick on average $m_r \ge 1$ random vertices as initial contacts.
3. Pick on average $m_s \ge 0$ neighbours of each initial contact as secondary contacts.
4. Pick on average $m_t \ge 1$ neighbours of each secondary contact as tertiary contacts.
5. Connect the new vertex to the initial, secondary and tertiary contacts.
6. Repeat steps 2-5 until the network has grown to the desired size.
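The steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation (the paper reports using C). The function name `grow_network`, the complete seed graph, and the concrete uniform choices for the numbers of initial, secondary and tertiary contacts are all assumptions made here for illustration. Only the numbered steps are implemented; the additional links between an initial contact and its neighbours of neighbours, or between two initial contacts, are not modelled.

```python
import random


def grow_network(n_total, seed_size=4, rng=None):
    """Grow an undirected graph stored as {vertex: set of neighbours}."""
    rng = rng or random.Random(0)
    # Step 1: seed network. A complete graph is an assumption of this sketch.
    adj = {v: {u for u in range(seed_size) if u != v} for v in range(seed_size)}
    # Step 6: repeat steps 2-5 until the desired size is reached.
    while len(adj) < n_total:
        new = len(adj)
        existing = list(adj)
        # Step 2: one or two random initial contacts (assumed distribution).
        initial = rng.sample(existing, rng.choice([1, 2]))
        secondary, tertiary = set(), set()
        for i in initial:
            # Step 3: 0, 1 or 2 neighbours of each initial contact (assumed).
            nbrs = sorted(adj[i])
            picked = rng.sample(nbrs, min(len(nbrs), rng.randrange(3)))
            secondary.update(picked)
            for s in picked:
                # Step 4: 1 or 2 neighbours of each secondary contact (assumed).
                cand = sorted(adj[s] - {i})
                tertiary.update(rng.sample(cand, min(len(cand), rng.choice([1, 2]))))
        # Step 5: connect the new vertex to every chosen contact.
        adj[new] = set()
        for c in set(initial) | secondary | tertiary:
            adj[new].add(c)
            adj[c].add(new)
    return adj


if __name__ == "__main__":
    g = grow_network(30)
    print("vertices:", len(g), "edges:", sum(len(n) for n in g.values()) // 2)
```

Storing the graph as a dictionary of neighbour sets keeps the neighbour lookups in steps 3 and 4 cheap and makes duplicate links impossible.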

Fig. 1. Growing process of the community network. The new vertex V initially connects to a random initial contact (say i). Vertex i then updates its neighbour-of-neighbour contact list and hence connects to l. V connects to $m_s$ neighbours of i (say j, l) and to $m_t$ neighbours of neighbours of i (say l).

The major advantage of this model is that a contact can eventually be established between two initially chosen random contacts, whereas the earlier model strictly rules out this contact. For example, consider the graph in Fig. 2.

Let a new node N1 select R1 and R2, which are members of two mutually exclusive communities, as its random initial contacts. In the earlier model there is no way to establish a contact between R1 and R2 at any future time. In our model, if any later node, say Nk, chooses either R1 or R2 as its random initial contact, a contact between R1 and R2 is immediately established via N1.

Fig. 3 visualizes a network graph with N = 30. The number of secondary contacts from each initial contact is $n_{2nd} \sim U[0,3)$ (uniformly distributed between 0 and 3), and each initial contact connects to U[1,2] of its neighbour-of-neighbour vertices. In this model we used 30 sample vertices and produced a strongly growing network.

III. VERTEX DEGREE DISTRIBUTION

We derive an approximate expression for the vertex degree distribution of the growing network model, which mixes random initial contacts, neighbours of initial contacts and neighbours of neighbours of initial contacts. Power-law degree distributions $p(k) \sim k^{-\gamma}$ with exponent $2 < \gamma < \infty$ have been derived in [16,18]. In this model the lower bound to the degree exponent $\gamma$ is again found to be 3, the same as in the earlier model.

The rate equation that describes how the degree of a vertex changes on average during one time step of the network growth is constructed as follows. The degree of vertex $v_i$ grows through three processes:

1) A new vertex links directly to $v_i$. At any time t there are on average ~t vertices, and $m_r$ of them are selected as initial contacts, so $v_i$ is selected with probability $m_r/t$.

2) A new vertex links to $v_i$ as a secondary contact. This selection gives rise to preferential attachment; there are on average $m_r m_s$ such links.

3) A new vertex links to $v_i$ as a tertiary contact. This is also a form of preferential attachment; there are on average $2 m_r m_s m_t$ such links.

These three processes lead to the following rate equation for the degree of vertex $v_i$ [1]:

$$\frac{\partial k_i}{\partial t} = \frac{1}{t}\left( m_r + \frac{m_r m_s + 2 m_r m_s m_t}{2\,(m_r + m_r m_s + 2 m_r m_s m_t)}\, k_i \right) \tag{1}$$

Here the average initial degree of a vertex is $k_{init} = m_r + m_r m_s + 2 m_r m_s m_t$.

Separating variables and integrating (from $t_i$ to $t$, and from $k_{init}$ to $k_i$), we get the following time evolution for the vertex degrees:

$$k_i(t) = B \left( \frac{t}{t_i} \right)^{1/A} - C \tag{2}$$

where

$$A = 2\left( \frac{m_r + m_r m_s + 2 m_r m_s m_t}{m_r m_s + 2 m_r m_s m_t} \right), \qquad B = A\left( m_r + \frac{1}{2} m_r m_s + m_r m_s m_t \right), \qquad C = A\, m_r$$
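As a sanity check on Eq. (2), the rate equation (1) can be integrated numerically and compared with the closed form. The sketch below is an illustration under stated assumptions: the function names and the forward-Euler scheme are choices made here, not part of the paper, and the parameter values in the usage lines are arbitrary.

```python
def constants(m_r, m_s, m_t):
    """Return k_init, A, B, C as defined alongside Eqs. (1) and (2)."""
    k_init = m_r + m_r * m_s + 2 * m_r * m_s * m_t
    A = 2 * k_init / (m_r * m_s + 2 * m_r * m_s * m_t)
    B = A * (m_r + 0.5 * m_r * m_s + m_r * m_s * m_t)
    C = A * m_r
    return k_init, A, B, C


def k_closed(t, t_i, m_r, m_s, m_t):
    """Closed-form degree of a vertex born at t_i, Eq. (2)."""
    _, A, B, C = constants(m_r, m_s, m_t)
    return B * (t / t_i) ** (1 / A) - C


def k_euler(t, t_i, m_r, m_s, m_t, steps=200_000):
    """Forward-Euler integration of Eq. (1), starting from k_init at t_i."""
    k_init, A, _, _ = constants(m_r, m_s, m_t)
    k, tau = k_init, t_i
    dt = (t - t_i) / steps
    for _ in range(steps):
        # Eq. (1): dk/dt = (m_r + k / A) / t, since the k_i coefficient is 1/A.
        k += dt * (m_r + k / A) / tau
        tau += dt
    return k


if __name__ == "__main__":
    args = (100.0, 10.0, 1, 1, 1)  # t, t_i, m_r, m_s, m_t (arbitrary values)
    print("closed form:", k_closed(*args))
    print("Euler      :", k_euler(*args))
```

At $t = t_i$ the closed form returns $B - C$, which equals $k_{init}$, so the initial condition is reproduced by construction.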

From the time evolution of the vertex degree $k_i(t)$, we can calculate the degree distribution p(k) by forming the cumulative distribution F(k) and differentiating with respect to k. Since in the mean field approximation [1] the degree $k_i(t)$ of a vertex $v_i$ increases monotonically from the time $t_i$ at which the vertex is added to the network, the fraction of vertices whose degree is less than $k_i(t)$ at time t is equivalent to the fraction of vertices that were introduced after time $t_i$. Since t is evenly distributed, this fraction is $(t - t_i)/t$. These facts lead to the cumulative distribution [1]

$$F(k_i) = P(k \le k_i) = P(t \ge t_i) = \frac{1}{t}\,(t - t_i) \tag{3}$$

Solving for $t_i = t_i(k_i, t) = B^{A} (k_i + C)^{-A}\, t$ from (2), inserting it into (3), differentiating $F(k_i)$ with respect to $k_i$, and replacing the notation $k_i$ by $k$, we get the probability density distribution for the degree k as

$$p(k) = A\, B^{A}\, (k + C)^{-\frac{2}{m_s + 2 m_s m_t} - 3} \tag{4}$$
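The intermediate steps between Eqs. (3) and (4) can be written out as follows. This is a sketch that uses only Eq. (2) and the definitions of A, B and C; the last line rewrites the exponent $A + 1$ in terms of $m_s$ and $m_t$.

```latex
\begin{align*}
\frac{t_i}{t} &= B^{A}\,(k_i + C)^{-A}
  && \text{from Eq. (2)} \\
F(k_i) &= 1 - \frac{t_i}{t} = 1 - B^{A}\,(k_i + C)^{-A}
  && \text{from Eq. (3)} \\
p(k) &= \frac{\mathrm{d}F}{\mathrm{d}k} = A\,B^{A}\,(k + C)^{-A-1} \\
A + 1 &= \frac{2\,(1 + m_s + 2 m_s m_t)}{m_s + 2 m_s m_t} + 1
       = 3 + \frac{2}{m_s + 2 m_s m_t}
\end{align*}
```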

Here A, B and C are as above. In the limit of large k, the distribution becomes a power law $p(k) \sim k^{-\gamma}$ with $\gamma = 3 + 2/(m_s + 2 m_s m_t)$, $m_s > 0$, leading to $3 < \gamma < \infty$. Hence the lower bound to the degree exponent is 3, the same as in the earlier model. However, because the denominator of the fraction in the degree exponent, $m_s + 2 m_s m_t$, is larger than the corresponding $m_s$ of the earlier model, $\gamma$ is smaller for the same $m_s$ and the probability density at large k is larger than in the earlier model.
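The behaviour of the exponent can be explored numerically. The sketch below takes the exponent from Eq. (4), that is $\gamma = A + 1 = 3 + 2/(m_s + 2 m_s m_t)$, and also uses the tail probability $P(K > k) = B^{A}(k + C)^{-A}$ that follows from Eq. (3). The function names and parameter values are assumptions made here for illustration.

```python
import math


def _abc(m_r, m_s, m_t):
    """Return k_init, A, B, C from the definitions given with Eq. (2)."""
    k_init = m_r + m_r * m_s + 2 * m_r * m_s * m_t
    A = 2 * k_init / (m_r * m_s + 2 * m_r * m_s * m_t)
    B = A * (m_r + 0.5 * m_r * m_s + m_r * m_s * m_t)
    return k_init, A, B, A * m_r


def degree_exponent(m_s, m_t):
    """Exponent read from Eq. (4); m_t = 0 gives the earlier 3 + 2/m_s."""
    return 3 + 2 / (m_s + 2 * m_s * m_t)


def log_degree_pdf(k, m_r, m_s, m_t):
    """Natural log of p(k) in Eq. (4), written with exponent -(A + 1)."""
    _, A, B, C = _abc(m_r, m_s, m_t)
    return math.log(A) + A * math.log(B) - (A + 1) * math.log(k + C)


def tail_probability(k, m_r, m_s, m_t):
    """P(K > k) = 1 - F(k) = B^A (k + C)^(-A)."""
    _, A, B, C = _abc(m_r, m_s, m_t)
    return (B / (k + C)) ** A


if __name__ == "__main__":
    for m_t in (0, 1, 2):
        print("m_s = 1, m_t =", m_t, "-> gamma =", degree_exponent(1, m_t))
```

Because $k_{init} + C = B$, the tail probability at $k = k_{init}$ is 1, which confirms that Eq. (4) is normalised on $[k_{init}, \infty)$.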

IV. CLUSTERING

The dependence of the clustering coefficient on vertex degree can also be found by the rate equation method [17]. Let us examine how the number of triangles $E_i$ around a vertex $v_i$ changes with time. The triangles around $v_i$ are mainly generated by three processes:

1. Vertex $v_i$ is chosen as one of the initial contacts with probability $m_r/t$, and the new vertex links to some of its neighbours as secondary contacts, giving rise to triangles.

2. Vertex $v_i$ is chosen as a secondary contact, and the new vertex links to it as its primary or tertiary contact, giving rise to triangles.

3. Vertex $v_i$ is chosen as a tertiary contact, and the new vertex links to it as its primary or secondary contact, giving rise to triangles.

These three processes are described by the rate equation [1]

$$\frac{\partial E_i}{\partial t} = \frac{\partial k_i}{\partial t} - \frac{1}{t}\left( m_r - m_r m_s - 3 m_r m_s m_t - \frac{5 m_r m_s m_t}{2\,(m_r + m_r m_s + 2 m_r m_s m_t)}\, k_i \right) \tag{5}$$

where the right-hand side is obtained by applying Eq. (1). Integrating both sides with respect to t and using the initial condition $E_i(k_{init}, t_i) = E_{init} = m_r m_s (1 + 3 m_t)$, we get the time evolution of the number of triangles around a vertex $v_i$ as

$$E_i(t) = (a + b k_i) \ln\left( \frac{t}{t_i} \right) + \frac{a + b k_i}{b} \ln\left( \frac{a + b k_i}{a + b k_{init}} \right) + E_{init} \tag{6}$$

where the constants a and b are given after Eq. (7).

We now make use of the previously found dependence of $k_i$ on $t_i$ to find $c_i(k)$. Solving for $\ln(t/t_i)$ in terms of $k_i$ from (2), inserting it into (6) to get $E_i(k_i)$, and dividing $E_i(k_i)$ by the maximum possible number of triangles, $k_i(k_i - 1)/2$, we arrive at the clustering coefficient

$$c_i(k_i) = \frac{2 E_i(k_i)}{k_i (k_i - 1)} \tag{7}$$

with

$$\begin{aligned}
E_i(k_i) &= A b k_i \ln(k_i + C) + k_i \ln\left(\frac{a + b k_i}{a + b k_{init}}\right) - b A k_i \ln B \\
&\quad + a A \ln(k_i + C) + \frac{a}{b} \ln\left(\frac{a + b k_i}{a + b k_{init}}\right) + E_{init} - a A \ln B \\
&= D k_i \ln(k_i + C) + k_i \ln(F + G k_i) - D k_i \ln B \\
&\quad + H \ln(k_i + C) + I \ln(F + G k_i) + J
\end{aligned}$$

where

$$a = m_r m_s + 3 m_r m_s m_t - m_r, \qquad b = \frac{5 m_r m_s m_t}{2\,(m_r + m_r m_s + 2 m_r m_s m_t)},$$

$$D = A b, \quad F = \frac{a}{a + b k_{init}}, \quad G = \frac{b}{a + b k_{init}}, \quad H = a A, \quad I = \frac{a}{b}, \quad J = E_{init} - H \ln B$$

For large values of the degree k, the clustering coefficient thus depends on k as $c(k) \sim \ln k / k$. This is a much larger clustering coefficient than in the earlier work, where $c(k) \sim 1/k$.
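The expressions of this section can be evaluated numerically. The sketch below computes $E_i(k_i)$ both from Eq. (6), with $\ln(t/t_i) = A \ln((k_i + C)/B)$ taken from Eq. (2), and from the form written with the constants D to J, and then applies Eq. (7). The function names and the parameter values $m_r = m_s = m_t = 1$ are assumptions chosen for illustration; the formulas require $m_s > 0$ and $m_t > 0$ so that $b \neq 0$.

```python
import math


def _constants(m_r, m_s, m_t):
    """Constants of Sections III and IV for the given parameters."""
    k_init = m_r + m_r * m_s + 2 * m_r * m_s * m_t
    A = 2 * k_init / (m_r * m_s + 2 * m_r * m_s * m_t)
    B = A * (m_r + 0.5 * m_r * m_s + m_r * m_s * m_t)
    C = A * m_r
    a = m_r * m_s + 3 * m_r * m_s * m_t - m_r
    b = 5 * m_r * m_s * m_t / (2 * k_init)
    e_init = m_r * m_s * (1 + 3 * m_t)
    return k_init, A, B, C, a, b, e_init


def triangles_eq6(k, m_r, m_s, m_t):
    """E_i(k_i) from Eq. (6) with ln(t/t_i) eliminated via Eq. (2)."""
    k_init, A, B, C, a, b, e_init = _constants(m_r, m_s, m_t)
    log_t = A * math.log((k + C) / B)
    log_ratio = math.log((a + b * k) / (a + b * k_init))
    return (a + b * k) * log_t + (a + b * k) / b * log_ratio + e_init


def triangles_expanded(k, m_r, m_s, m_t):
    """The same quantity written with the constants D, F, G, H, I, J."""
    k_init, A, B, C, a, b, e_init = _constants(m_r, m_s, m_t)
    D, H, I = A * b, a * A, a / b
    F, G = a / (a + b * k_init), b / (a + b * k_init)
    J = e_init - H * math.log(B)
    return (D * k * math.log(k + C) + k * math.log(F + G * k)
            - D * k * math.log(B) + H * math.log(k + C)
            + I * math.log(F + G * k) + J)


def clustering(k, m_r, m_s, m_t):
    """Clustering coefficient c(k), Eq. (7)."""
    return 2 * triangles_eq6(k, m_r, m_s, m_t) / (k * (k - 1))


if __name__ == "__main__":
    for k in (10, 1e3, 1e6):
        c = clustering(k, 1, 1, 1)
        print("k =", k, "c(k) =", c, "c(k) * k =", c * k)
```

If $c(k)$ scaled as $1/k$, the product $c(k)\,k$ would level off; under the formulas above it keeps growing with k, which reflects the additional $\ln k$ factor.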

V. RESULTS

Here the earlier model and the current model are compared by calculating the edge-to-vertex ratio and the triangle-to-vertex ratio for 30 vertices. The results are given in TABLE: 1 and TABLE: 2. One can see an enormous increase in secondary contacts. In addition, tertiary contacts have been added in our model, which leads to faster and more complex growth of the network.
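The paper does not spell out how the two ratios were computed. One plausible reading, sketched below, divides the total number of edges and the total number of triangles by the number of vertices. The function name `ratios` and the adjacency-set representation are assumptions made here for illustration.

```python
from itertools import combinations


def ratios(adj):
    """Return (edges per vertex, triangles per vertex) for an undirected graph.

    adj maps each vertex to the set of its neighbours.
    """
    n = len(adj)
    # Every edge appears in two neighbour sets.
    edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    # Count neighbour pairs that are themselves linked; each triangle is
    # found once from each of its three corners.
    corner_count = sum(
        1
        for nbrs in adj.values()
        for u, w in combinations(sorted(nbrs), 2)
        if w in adj[u]
    )
    return edges / n, (corner_count // 3) / n


if __name__ == "__main__":
    # Complete graph on four vertices: 6 edges and 4 triangles.
    k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
    print(ratios(k4))
```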


TABLE: 1

Data on existing model

             Initial Contact (IC)   Secondary Contacts (SC)   Difference
Vertices     2.53                   3.66                      1.13
Triangles    0                      5.34                      5.34

TABLE: 2

SIMULATION RESULTS

The results below are represented graphically by calculating the degree (number of contacts) of each node. They also show an enormous growth in the degree of the nodes.

Fig. 4. Comparison of results for the growing network community. Initial contacts grow at a very slow rate compared to secondary contacts: ■ indicates initial contacts, ♦ indicates secondary contacts and ▲ indicates the degree of each vertex (the sum of initial and secondary contacts) when initial and secondary contacts connecting to a vertex $v_i$ are considered. The simulation results are based on TABLE: 1.

Fig. 5. Comparison of results for the growing network community. Initial contacts grow at a very slow rate compared to secondary contacts: ■ indicates initial contacts, ♦ indicates secondary contacts, ▲ indicates neighbours of neighbours of the initial contact connecting to the vertex $v_i$, and ● indicates the degree of each vertex when initial, secondary and tertiary contacts connect to a vertex $v_i$. Our network community grows much faster and becomes more complex than in the existing model. The simulation results are based on TABLE: 2.

VI. CONCLUSION

In this paper, a model that reproduces networks that are very efficient compared to real social networks has been developed. The lower bound to the degree exponent is the same as in the earlier model, and the probability distribution for the degree k agrees with the earlier result for $m_t = 0$. For large values of the degree k, the clustering coefficient is much larger, scaling as $\ln(k_i)/k_i$ compared with the earlier result $1/k_i$. Thus an efficient but complex model of social networks has been developed, which gives an enormous growth in the probability distribution, the clustering coefficient and the edge-to-vertex ratio while retaining the community structure. This model can be used to develop a new kind of social networking among various research groups.

TOOL

We have used the C language, UciNet, NetDraw and Excel for creating graphs and simulations.

Notations

Notation   Description
$m_r$      Random initial contact
$m_s$      Secondary contact
$m_t$      Tertiary contact (neighbour of neighbour of initial contact)

REFERENCES

[1] Riitta Toivonen, Jukka-Pekka Onnela, Jari Saramäki, Jörkki Hyvönen, Kimmo Kaski, "A model for social networks", Physica A 371 (2006) 851–860.
[2] S. Milgram, Psychology Today 2 (1967) 60–67.
[3] M. Granovetter, "The Strength of Weak Ties", Am. J. Soc. 78 (1973) 1360–1380.
[4] S. Wasserman, K. Faust, Social Network Analysis, Cambridge University Press, Cambridge, 1994.
[5] Duncan J. Watts, Steven H. Strogatz, "Collective dynamics of 'small-world' networks", Nature 393 (1998) 440.
[6] M. Newman, "The structure of scientific collaboration networks", PNAS 98 (January 17, 2001) 404–409.
[7] M. Newman, "Coauthorship networks and patterns of scientific collaboration", PNAS 101 (April 6, 2004) 5200–5205.
[8] P. Holme, C. R. Edling, Fredrik Liljeros, "Structure and Time-Evolution of an Internet Dating Community", Soc. Networks 26 (2004) 155–174.
[9] M. Girvan, M. E. J. Newman, "Community structure in social and biological networks", Proc. Natl. Acad. Sci. USA 99 (2002) 7821–7826.
[10] M. E. J. Newman, "The structure and function of complex networks", SIAM Review 45 (2003) 167–256.
[11] A.-L. Barabási, R. Albert, "Emergence of scaling in random networks", Science 286 (1999) 509–512.
[12] M. E. J. Newman, "Assortative Mixing in Networks", Phys. Rev. Lett. 89 (2002) 208701.
[13] M. E. J. Newman, Juyong Park, "Why social networks are different from other types of networks", Phys. Rev. E 68 (2003) 036122.
[14] L. A. N. Amaral, A. Scala, M. Barthélémy, H. E. Stanley, "Classes of small-world networks", PNAS 97 (October 10, 2000) 11149–11152.
[15] Marian Boguna, Romualdo Pastor-Satorras, Albert Díaz-Guilera, Alex Arenas, "Models of social networks based on social distance attachment", Phys. Rev. E 70 (2004) 056122.
[16] T. Evans, J. Saramäki, "Scale-free networks from self-organization", Phys. Rev. E 72 (2005) 026138.
[17] Gabor Szabo, Mikko Alava, Janos Kertesz, "Structural transitions in scale-free networks", Phys. Rev. E 67 (2003) 056102.
[18] P. L. Krapivsky, S. Redner, "Organization of growing random networks", Phys. Rev. E 63 (2001) 066123.

Data based on our proposed model (TABLE: 2)

             Initial Contact (IC)   Secondary Contacts (SC)   Neighbour of Neighbour IC (NNIC)
Vertices     2.53                   6.12                      3.43
Triangles    0                      6.0                       6.44
