28.02.2018 2_network_properties
file:///home/szwabin/Dropbox/Zajecia/Diffusion/Lectures/2_networks/2_network_properties.html 1/28
Diffusion processes on complex networks
Lecture 2 - Network properties
Janusz Szwabiński
Overview:
- Degrees and their distribution
- Adjacency matrix
- Paths and distances
- Network diameter
- Connectedness
- Clustering coefficient
In [1]:
%matplotlib inline
In [2]:
import networkx as nx
Degrees and their distributions
- the degree of a node represents the number of links the node has to other nodes
- it is a key property of each node in a network
- in an undirected graph the total number of links $L$ can be expressed as the sum of the node degrees:

$$L = \frac{1}{2}\sum_{i=1}^{N} k_i$$

The factor $1/2$ corrects for the fact that in the sum each link is counted twice.
In [3]:
G = nx.barabasi_albert_graph(100,4)
In [4]:
len(G.edges())
Out[4]:
384
In [5]:
def count_edges(graph):
    """Sums node degrees to count the number of edges"""
    count = 0
    for n in graph.nodes():
        count = count + graph.degree(n)
    return count/2
In [6]:
count_edges(G)
Out[6]:
384.0
In [7]:
H = nx.erdos_renyi_graph(100,0.2)
In [8]:
len(H.edges()) == int(count_edges(H))
Out[8]:
True
Average degree
- an important property of a whole network
- undirected networks:

$$\langle k \rangle = \frac{1}{N}\sum_{i=1}^{N} k_i = \frac{2L}{N}$$

In [9]:
degrees = dict(G.degree())
kaver = sum(degrees.values())/len(G)
print(kaver)
7.68
In [10]:
2*len(G.edges())/len(G)
Out[10]:
7.68
Directed networks
- incoming degree $k_i^{in}$ - number of links pointing to node $i$
  - for example, it is the number of WWW pages that include hyperlinks pointing to a given document
- outgoing degree $k_i^{out}$ - number of links that point from node $i$ to other nodes
  - e.g. the number of webpages a given document is pointing to (i.e. the number of hyperlinks contained in this document)
- a node's total degree is then given by

$$k_i = k_i^{in} + k_i^{out}$$

- degree and the number of edges:

$$L = \sum_{i=1}^{N} k_i^{in} = \sum_{i=1}^{N} k_i^{out}$$

- average degree:

$$\langle k^{in} \rangle = \frac{1}{N}\sum_{i=1}^{N} k_i^{in} = \frac{L}{N}$$

$$\langle k^{out} \rangle = \frac{1}{N}\sum_{i=1}^{N} k_i^{out} = \frac{L}{N}$$

Thus:

$$\langle k^{in} \rangle = \langle k^{out} \rangle$$

In [11]:
D = nx.scale_free_graph(20)
In [12]:
nx.draw(D)
In [13]:
len(D.edges())
Out[13]:
46
In [14]:
count_edges(D)
Out[14]:
46.0
In [15]:
outdegrees = dict(D.out_degree())
koa = sum(outdegrees.values())/len(D)
print(koa)
2.3
In [16]:
indegrees = dict(D.in_degree())
kia = sum(indegrees.values())/len(D)
print(kia)
2.3
In [17]:
len(D.edges())/len(D)
Out[17]:
2.3
Degree distribution
In [18]:
G.degree()
Out[18]:
DegreeView({0: 12, 1: 24, 2: 24, 3: 1, 4: 37, 5: 30, 6: 14, 7: 16,
8: 26, 9: 10, 10: 21, 11: 9, 12: 11, 13: 14, 14: 15, 15: 15, 16: 10,
17: 6, 18: 10, 19: 12, 20: 6, 21: 8, 22: 6, 23: 8, 24: 11, 25: 12,
26: 11, 27: 7, 28: 11, 29: 7, 30: 7, 31: 8, 32: 10, 33: 8, 34: 7,
35: 8, 36: 5, 37: 8, 38: 10, 39: 7, 40: 4, 41: 5, 42: 7, 43: 4,
44: 5, 45: 8, 46: 5, 47: 4, 48: 7, 49: 4, 50: 6, 51: 8, 52: 4, 53: 8,
54: 5, 55: 4, 56: 8, 57: 6, 58: 5, 59: 5, 60: 5, 61: 6, 62: 5, 63: 5,
64: 4, 65: 5, 66: 6, 67: 4, 68: 4, 69: 4, 70: 4, 71: 4, 72: 4, 73: 4,
74: 4, 75: 4, 76: 4, 77: 4, 78: 4, 79: 7, 80: 4, 81: 5, 82: 4, 83: 5,
84: 4, 85: 4, 86: 5, 87: 4, 88: 4, 89: 4, 90: 4, 91: 4, 92: 4, 93: 4,
94: 4, 95: 4, 96: 4, 97: 4, 98: 4, 99: 4})
- nodes have different degrees
- their distribution is another important property of a network
- the degree distribution $p_k$ provides the probability that a randomly selected node in the network has degree $k$
- since $p_k$ is a probability, it must be normalized:

$$\sum_{k=1}^{\infty} p_k = 1$$

- for a network with $N$ nodes the degree distribution is nothing but the normalized histogram

$$p_k = \frac{N_k}{N},$$

where $N_k$ is the number of nodes with degree $k$. Hence we have $N_k = N p_k$.
Example 1
Consider the following network:
In [19]:
G1 = nx.Graph()
G1.add_edges_from([(3,4),(4,2),(3,2),(2,1)])
In [20]:
nx.draw(G1,with_labels=True)
We have $N = 4$ nodes. One node has the degree 1, thus

$$p_1 = \frac{1}{4} = 0.25.$$

There are 2 nodes with the degree 2:

$$p_2 = \frac{2}{4} = 0.5$$

And we have one node with degree 3:

$$p_3 = \frac{1}{4} = 0.25.$$

There are no other nodes, i.e.

$$p_{k>3} = 0.$$

Let us plot the corresponding histogram:
In [21]:
import matplotlib.pyplot as plt
degs = [1,2,3]
pvals = [0.25,0.5,0.25]
plt.stem(degs,pvals)
plt.xlabel("degree $k$")
plt.ylabel("$p_k$")
plt.title("Degree distribution")
plt.xticks(range(1,4))
Out[21]:
([<matplotlib.axis.XTick at 0x7fb7fe257e80>,
<matplotlib.axis.XTick at 0x7fb8350e87b8>,
<matplotlib.axis.XTick at 0x7fb7fe220c88>],
<a list of 3 Text xticklabel objects>)
Now, let us calculate the histogram on a computer:
In [22]:
import collections
degree_sequence = sorted([d for n, d in G1.degree()], reverse=True) # degree sequence
print(degree_sequence)
degreeCount = collections.Counter(degree_sequence)
print(degreeCount)
deg, cnt = zip(*degreeCount.items())
print(deg,cnt)
[3, 2, 2, 1]
Counter({2: 2, 3: 1, 1: 1})
(3, 2, 1) (1, 2, 1)
In [23]:
#the histogram
plt.stem(deg, cnt)
plt.title("Degree Histogram")
plt.ylabel("Count")
plt.xlabel("Degree")
#and the graph as inset
plt.axes([0.6, 0.6, 0.2, 0.2]) #left bottom width height
pos = nx.spring_layout(G1)
plt.axis('off')
nx.draw_networkx_nodes(G1, pos, node_size=20)
nx.draw_networkx_edges(G1, pos, alpha=0.4)
Out[23]:
<matplotlib.collections.LineCollection at 0x7f56df005a20>
Example 2
In [24]:
G2 = nx.watts_strogatz_graph(11,2,0)
nx.draw_circular(G2,with_labels=True)
In this case we have a ring with all nodes having the degree 2, thus

$$p_k = \begin{cases} 1, & k = 2 \\ 0, & \text{otherwise} \end{cases}$$

The corresponding histogram is as follows:
In [25]:
#count the degrees
degree_sequence = sorted([d for n, d in G2.degree()], reverse=True) # degree sequence
degreeCount = collections.Counter(degree_sequence)
deg, cnt = zip(*degreeCount.items())
#plot the histogram
plt.stem(deg, cnt)
plt.title("Degree Histogram")
plt.ylabel("Count")
plt.xlabel("Degree")
#and the graph as inset
plt.axes([0.6, 0.6, 0.2, 0.2]) #left bottom width height
pos = nx.spring_layout(G2)
plt.axis('off')
nx.draw_networkx_nodes(G2, pos, node_size=20)
nx.draw_networkx_edges(G2, pos, alpha=0.4)
Out[25]:
<matplotlib.collections.LineCollection at 0x7f56df0aacc0>
Please note that in this case the degree distribution is a Kronecker delta function, i.e. $p_k = \delta(k - 2)$.
Importance of the degree distribution
- it allows one to calculate many network properties
  - e.g. the average degree of a network may be written as

$$\langle k \rangle = \sum_{k=0}^{\infty} k\, p_k$$

- the precise functional form of $p_k$ impacts many network phenomena, from network robustness to the spread of viruses
Degree distributions of real networks
- node degrees can vary widely
- networks (c), (d) and (f) appear to have power-law distributions, as indicated by their approximately straight-line forms on the doubly logarithmic scales
- (b) has a power-law tail, but deviates from power-law behavior for small degrees
- the power grid network is described by an exponential distribution (note the log-linear scale)
- network (a) appears to have a truncated power-law degree distribution of some type or possibly two separate power-law regimes with different exponents
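The normalized histogram $p_k = N_k/N$ and the identity $\langle k \rangle = \sum_k k\,p_k$ can be checked numerically. The following is a minimal sketch in plain Python for the small Example 1 network; the variable names (`edges`, `N_k`, `p`) are chosen here for illustration and are not part of the lecture code.

```python
from collections import Counter

# Edge list of the Example 1 network (nodes 1-4)
edges = [(3, 4), (4, 2), (3, 2), (2, 1)]

# Degree of each node: every edge contributes 1 to both of its endpoints
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

N = len(degree)                           # number of nodes (no isolated nodes here)
N_k = Counter(degree.values())            # N_k: number of nodes with degree k
p = {k: N_k[k] / N for k in sorted(N_k)}  # p_k = N_k / N

k_avg_from_pk = sum(k * p_k for k, p_k in p.items())  # <k> = sum_k k p_k
k_avg_direct = 2 * len(edges) / N                     # <k> = 2L / N

print(p)                            # {1: 0.25, 2: 0.5, 3: 0.25}
print(sum(p.values()))              # 1.0
print(k_avg_from_pk, k_avg_direct)  # 2.0 2.0
```

Both routes to the average degree agree, as they should for any undirected network.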
Adjacency matrix
- for mathematical purposes networks are often represented through their adjacency matrices
- the adjacency matrix of a directed network of $N$ nodes has $N$ rows and $N$ columns, its elements being:
  - $A_{ij} = 1$ if there is a link pointing from node $j$ to node $i$
  - $A_{ij} = 0$ if nodes $i$ and $j$ are not connected to each other
- the adjacency matrix of an undirected network is symmetric, i.e. $A_{ij} = A_{ji}$
Example 1
Consider the following undirected network:
In [26]:
G = nx.Graph()
G.add_edges_from([(1,3),(1,2),(3,2),(2,4)])
nx.draw_spring(G,with_labels=True)
The adjacency matrix of the graph is

$$A = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}$$
Having the adjacency matrix, we can express the degree of a node as a sum over the appropriate column or row of the matrix, e.g.

$$k_2 = \sum_{j=1}^{4} A_{2j} = \sum_{i=1}^{4} A_{i2} = 3$$

Since $A_{ij} = A_{ji}$ and $A_{ii} = 0$, then

$$L = \frac{1}{2}\sum_{i,j=1}^{N} A_{ij}$$

In NetworkX, once we have built a graph, we can look at its adjacency matrix with
In [27]:
A = nx.adjacency_matrix(G)
In [28]:
print(A.todense())
[[0 1 1 0]
[1 0 1 0]
[1 1 0 1]
[0 0 1 0]]
The rows and columns follow the node order returned by G.nodes(), which need not be sorted. Passing nodelist fixes the order:
In [29]:
G.nodes()
Out[29]:
NodeView((1, 3, 2, 4))
In [30]:
A = nx.adjacency_matrix(G,nodelist=[1,2,3,4])
print(A.todense())
[[0 1 1 0]
[1 0 1 1]
[1 1 0 0]
[0 1 0 0]]
Example 2
Let us consider now a directed network, e.g.
In [31]:
D = nx.DiGraph()
D.add_edges_from([(3,2),(1,2),(3,1),(2,4)])
nx.draw_spring(D, with_labels=True)
The corresponding adjacency matrix has the form

$$A = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}$$

Again, we can use the matrix to calculate the degree of a node and the number of edges:

$$k_2^{in} = \sum_{j=1}^{4} A_{2j} = 2$$

$$k_2^{out} = \sum_{i=1}^{4} A_{i2} = 1$$

$$L = \sum_{i,j=1}^{N} A_{ij}$$

Important note
In some texts you may find a different convention for the adjacency matrix of a directed graph, i.e.

$$A^*_{ij} = \begin{cases} 1, & \text{if there is an edge from } i \text{ to } j \\ 0, & \text{otherwise} \end{cases}$$

Coming back to our example
In [32]:
nx.draw_spring(D, with_labels=True)
this alternative definition gives the following adjacency matrix:

$$A^* = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

Note that

$$A^* = A^T$$

In other words, the row sums, which gave the incoming degrees before, now give the outgoing ones and vice versa:

$$k_2^{in} = \sum_{i=1}^{4} A^*_{i2} = 2$$

$$k_2^{out} = \sum_{j=1}^{4} A^*_{2j} = 1$$

It actually does not matter which convention is used, provided it is used consistently.
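The two conventions can be compared side by side. The sketch below builds the matrix $A$ of the directed example by hand (nested lists, no libraries), transposes it to obtain $A^*$, and reads the degrees off the row and column sums. The helper names are illustrative only; note also that NetworkX itself documents its adjacency matrix in the $A^*$ convention (entry $i,j$ corresponds to an edge from $i$ to $j$).

```python
# Directed example from above: edges given as (source, target)
edges = [(3, 2), (1, 2), (3, 1), (2, 4)]
N = 4

# Lecture convention: A[i][j] = 1 if there is a link pointing from node j to node i
# (nodes are 1-based in the text, list indices are 0-based)
A = [[0] * N for _ in range(N)]
for source, target in edges:
    A[target - 1][source - 1] = 1

# Alternative convention: A* is the transpose of A
A_star = [list(column) for column in zip(*A)]

k_in = [sum(row) for row in A]             # row sums of A
k_out = [sum(column) for column in zip(*A)]  # column sums of A
L = sum(map(sum, A))                       # every link appears exactly once

print(A)       # [[0, 0, 1, 0], [1, 0, 1, 0], [0, 0, 0, 0], [0, 1, 0, 0]]
print(A_star)  # [[0, 1, 0, 0], [0, 0, 0, 1], [1, 1, 0, 0], [0, 0, 0, 0]]
print(k_in[1], k_out[1], L)  # node 2: 2 1, and L = 4
```

With $A^*$ the roles swap: its row sums are the outgoing degrees and its column sums the incoming ones.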
Paths and distances
- in physical systems distance plays a key role in determining the interaction between their components
  - e.g. the distance between the Sun and the Earth determines the gravitational force that acts between them
- in case of networks distance is a challenging concept
  - e.g. what is the distance between two webpages?
  - the physical distance is not relevant here (webpages could be hosted on servers on opposite sides of the globe)
- in networks the physical distance is usually replaced by the path length
  - a path is a route that runs along the links of a network
  - the length of a path is the number of links the path contains
- the shortest path between nodes 1 and 7 is the path with the fewest number of edges
- there can be multiple paths of the same length
- below, the length of the shortest path (the distance) between nodes $i$ and $j$ will be denoted as $d_{ij}$
- in an undirected network $d_{ij} = d_{ji}$
- in directed networks usually $d_{ij} \neq d_{ji}$
Adjacency matrix and paths
- the number of shortest paths $N_{ij}$ and the distance $d_{ij}$ between nodes $i$ and $j$ can be calculated directly from the adjacency matrix $A_{ij}$

$d_{ij} = 1$: If there is a direct link between $i$ and $j$, then $A_{ij} = 1$.

$d_{ij} = 2$: If there is a path of length 2 between nodes $i$ and $j$, then it must be $A_{ik}A_{kj} = 1$ for some $k$. Then the number of $d_{ij} = 2$ paths is given by

$$N^{(2)}_{ij} = \sum_{k=1}^{N} A_{ik}A_{kj} = (A^2)_{ij}$$

This is nothing but an element of $A^2$.

$d_{ij} = d$: If there is a path of length $d$ between nodes $i$ and $j$, then $A_{ik} \dots A_{rj} = 1$. As before, the number of paths of length $d$ between $i$ and $j$ is given by

$$N^{(d)}_{ij} = (A^d)_{ij}$$

- these equations hold for both directed and undirected networks
- the distance between nodes $i$ and $j$ is the smallest $d$ for which $N^{(d)}_{ij} > 0$
- elegant approach that works well for networks of moderate sizes
- for large networks it is more efficient to use BFS to determine the distances between nodes
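The matrix-power recipe can be tried on the small undirected network of Example 1. The sketch below is only an illustration under a few assumptions: the helper names `mat_mul` and `distance_from_powers` are made up here, node indices are 0-based, and the matrix product is written out in plain Python rather than taken from a numerical library.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def distance_from_powers(A, i, j):
    """Return (d_ij, number of shortest paths) for nodes i != j, or (None, 0).

    Uses the smallest d with (A^d)_ij > 0. With the lecture convention for
    directed networks, (A^d)_ij counts the paths leading from j to i.
    """
    n = len(A)
    power = A                      # A^1
    for d in range(1, n):          # a shortest path has at most n - 1 links
        if power[i][j] > 0:
            return d, power[i][j]
        power = mat_mul(power, A)  # A^(d+1)
    return None, 0                 # no path: the nodes are disconnected

# Undirected Example 1 network, node order 1, 2, 3, 4
A = [[0, 1, 1, 0],
     [1, 0, 1, 1],
     [1, 1, 0, 0],
     [0, 1, 0, 0]]

print(distance_from_powers(A, 0, 3))  # nodes 1 and 4: (2, 1)
print(distance_from_powers(A, 0, 1))  # nodes 1 and 2: (1, 1)
```

Each matrix product costs on the order of $N^3$ operations in this naive form, which is why the approach is practical only for networks of moderate size.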
Breadth-first search algorithm
- algorithm for traversing or searching tree or graph data structures
- it starts at the tree root (or some arbitrary node of a graph, sometimes referred to as a 'search key') and explores the neighbor nodes first, before moving to the next level neighbours
- BFS and its application in finding connected components of graphs were invented in 1945 by Konrad Zuse, but this was not published until 1972
- it was reinvented in 1959 by E. F. Moore, who used it to find the shortest path out of a maze
- it was discovered independently by C. Y. Lee as a wire routing algorithm (published 1961)
- applications:
  - copying garbage collection, Cheney's algorithm
  - finding the shortest path between two nodes u and v, with path length measured by number of edges
  - (reverse) Cuthill–McKee mesh numbering
  - Ford–Fulkerson method for computing the maximum flow in a flow network
  - serialization/deserialization of a binary tree, allows the tree to be re-constructed in an efficient manner
  - construction of the failure function of the Aho-Corasick pattern matcher
  - testing bipartiteness of a graph
Consider the following network:
The identification of the shortest path between nodes $i$ and $j$ goes along the following steps:
1. Start at node $i$ and label it with 0.
2. Find nodes directly linked with the node $i$. Label them with distance 1 and put them in a queue.
3. Take the first node, labelled $n$, out of the queue ($n = 1$ in the first step). Find the unlabelled nodes adjacent to it in the graph. Label them with $n + 1$ and put them in the queue.
4. Repeat step 3 until you find the target node $j$ or there are no more nodes in the queue.
5. The distance between $i$ and $j$ is the label of $j$.
6. If $j$ does not have a label, the nodes belong to different components. Then $d_{ij} = \infty$.
The computational complexity of the BFS algorithm is $O(N + L)$:
- linear in both $N$ and $L$
- each node needs to be entered and removed from the queue at most once
- each link has to be tested only once
In [33]:
G = nx.Graph()
In [34]:
G.add_edges_from([(1,2),(2,3),(2,4),(3,5),(4,5),(3,6),(5,6),(6,7),(7,8),(7,9)])
In [35]:
nx.draw_spring(G,with_labels=True)
In [37]:
bfs5 = nx.bfs_tree(G,5)
In [42]:
nx.draw_spring(bfs5,with_labels=True)
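The labelling steps of the BFS algorithm can also be written out by hand. The following is a minimal sketch in plain Python, not the NetworkX implementation; the function name `bfs_distances` and the dictionary-of-lists adjacency structure are assumptions made for this illustration. It uses the same edges as the graph G above and starts from node 5.

```python
from collections import deque

def bfs_distances(adjacency, source):
    """Label every node reachable from `source` with its distance (number of links)."""
    distance = {source: 0}      # step 1: label the start node with 0
    queue = deque([source])
    while queue:
        node = queue.popleft()  # step 3: take the first node out of the queue
        for neighbor in adjacency[node]:
            if neighbor not in distance:  # only unlabelled nodes get a label
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance

edges = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5), (3, 6), (5, 6), (6, 7), (7, 8), (7, 9)]
adjacency = {}
for u, v in edges:  # undirected network: store each link in both directions
    adjacency.setdefault(u, []).append(v)
    adjacency.setdefault(v, []).append(u)

dist = bfs_distances(adjacency, 5)
print(dist[1])                      # 3
print(dist.get(10, float("inf")))   # inf: a node without a label is unreachable
```

Every node enters the queue at most once and every link is inspected from each of its two ends, which is where the $O(N + L)$ running time comes from.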
In [44]:
for e in nx.bfs_edges(G,5):
    print(e)
(5, 3)
(5, 4)
(5, 6)
(3, 2)
(6, 7)
(2, 1)
(7, 8)
(7, 9)
Network diameter
- the diameter $d_{max}$ of the network is the maximum shortest path in the network
- i.e. the largest distance recorded between any pair of nodes
In [45]:
print(nx.diameter(G))
5
Average path length
The average path length $\langle d \rangle$ is the average distance between all pairs of nodes in the network.
For a directed network of $N$ nodes it is given by:

$$\langle d \rangle = \frac{1}{N(N-1)} \sum_{\substack{i,j=1,N \\ i \neq j}} d_{ij}$$

In case of an undirected network $d_{ij} = d_{ji}$, so each pair needs to be counted only once:

$$\langle d \rangle = \frac{2}{N(N-1)} \sum_{\substack{i,j=1,N \\ i < j}} d_{ij}$$

- the average path length is measured only for node pairs that are in the same component
- it distinguishes an easily negotiable network from one which is complicated and inefficient (a shorter average path length being more desirable)
- even if the average path length is small, the network itself might have some very remotely connected nodes (and many nodes which are neighbors of each other)
- we can use the BFS algorithm to determine $\langle d \rangle$ for a large network:
1. Determine the distances between the first node and all other nodes with BFS.
2. Determine the distances between the second node and all other nodes.
3. Repeat the procedure for all remaining nodes.
4. Sum the distances and divide them by the number of pairs.
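The four steps above can be sketched as follows. This is an illustrative plain-Python version that assumes a connected, undirected network; the helper `bfs_distances` is a hand-written BFS defined here, not a NetworkX function. The edges are those of the graph G used in this section.

```python
from collections import deque

def bfs_distances(adjacency, source):
    """Distances (number of links) from `source` to every reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance

edges = [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5), (3, 6), (5, 6), (6, 7), (7, 8), (7, 9)]
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, []).append(v)
    adjacency.setdefault(v, []).append(u)

N = len(adjacency)
total = 0   # sum of d_ij over all ordered pairs i != j
d_max = 0   # largest distance seen so far (the diameter)
for source in adjacency:        # steps 1-3: one BFS per node
    dist = bfs_distances(adjacency, source)
    total += sum(dist.values())  # the distance of a node to itself is 0
    d_max = max(d_max, max(dist.values()))

d_avg = total / (N * (N - 1))   # step 4: divide by the number of ordered pairs
print(d_avg, d_max)
```

Summing over ordered pairs and dividing by $N(N-1)$ gives the same value as summing over unordered pairs and dividing by $N(N-1)/2$. With one BFS per node the total cost is $O(N(N + L))$.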
In [46]:
nx.average_shortest_path_length(G)
Out[46]:
2.388888888888889
Connectedness
- key utility of most networks is to ensure connectedness
  - a phone would be of limited use as a communication device if we could not call any valid phone number
  - the network behind the phone must be capable of establishing a path between any two nodes
- in an undirected network nodes $i$ and $j$ are connected if there is a path between them
- they are disconnected if such a path does not exist, in which case we have $d_{ij} = \infty$
In [47]:
TCN = nx.Graph()
TCN.add_edges_from([(1,2),(2,3),(1,3),(4,7),(7,6),(7,5),(5,6)])
In [48]:
nx.draw_spring(TCN,with_labels=True)
- the above network consists of two disconnected clusters (components)
- within each cluster, there are paths between any two nodes
- there are no paths between nodes belonging to different clusters
- a network is connected if all pairs of nodes in the network are connected
- a network is disconnected if there is at least one pair with $d_{ij} = \infty$
- for small networks visual inspection can help us to decide if they are connected or not
- for networks of moderate sizes the adjacency matrix can be rearranged into a block diagonal form, if they are disconnected
In [49]:
A = nx.adjacency_matrix(TCN)
print(A.todense())
[[0 1 1 0 0 0 0]
[1 0 1 0 0 0 0]
[1 1 0 0 0 0 0]
[0 0 0 0 1 0 0]
[0 0 0 1 0 1 1]
[0 0 0 0 1 0 1]
[0 0 0 0 1 1 0]]
- thus, tools of linear algebra may be used to determine if the adjacency matrix is block diagonal
- for large networks the components are more efficiently identified using the BFS algorithm:
1. Start from a randomly chosen node $i$ and perform a BFS. Label all nodes reached this way with $n = 1$.
2. If the total number of labeled nodes equals $N$, then the network is connected. Otherwise, it consists of several components. To identify them, proceed to step 3.
3. Increase the label, $n \to n + 1$. Choose an unmarked node $j$ and start BFS to find all nodes reachable from $j$. Label them with $n$ and return to step 2.
In [51]:
for cc in nx.connected_components(TCN):
    print(cc)
{1, 2, 3}
{4, 5, 6, 7}
In [52]:
for cc in nx.connected_components(G):
    print(cc)
{1, 2, 3, 4, 5, 6, 7, 8, 9}
In [53]:
nx.is_connected(G)
Out[53]:
True
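The component-labelling procedure described above can be sketched in plain Python as follows. The function name `label_components` is hypothetical, and instead of picking start nodes at random the sketch simply takes the next unlabelled node in iteration order, which does not change the result. Nodes without any link would have to be added to `adjacency` separately, since the dictionary is built from the edge list.

```python
from collections import deque

def label_components(adjacency):
    """Assign the label n = 1, 2, ... to the nodes of each connected component."""
    label = {}
    n = 0
    for start in adjacency:
        if start in label:      # already reached by an earlier BFS
            continue
        n += 1                  # step 3: increase the label
        label[start] = n
        queue = deque([start])
        while queue:            # BFS over everything reachable from `start`
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in label:
                    label[neighbor] = n
                    queue.append(neighbor)
    return label

# The two-component network TCN from above
edges = [(1, 2), (2, 3), (1, 3), (4, 7), (7, 6), (7, 5), (5, 6)]
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, set()).add(v)
    adjacency.setdefault(v, set()).add(u)

labels = label_components(adjacency)
components = {}
for node, n in labels.items():
    components.setdefault(n, set()).add(node)

print(components)            # {1: {1, 2, 3}, 2: {4, 5, 6, 7}}
print(len(components) == 1)  # False: the network is not connected
```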
In [54]:
nx.is_connected(TCN)
Out[54]:
False
Clustering coefficient
- a clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together
- evidence suggests that in most real-world networks, and in particular social networks, nodes tend to create tightly knit groups characterised by a relatively high density of ties
- this likelihood tends to be greater than the average probability of a tie randomly established between two nodes
- two versions of this measure exist: the global and the local:
  - the global version was designed to give an overall indication of the clustering in the network
  - the local version gives an indication of the embeddedness of single nodes
For a node $i$ with degree $k_i$ the local clustering coefficient is defined as

$$c_i = \frac{2L_i}{k_i(k_i - 1)}$$

Here:
- $L_i$ - the number of links between neighbors of node $i$
- $\frac{k_i(k_i - 1)}{2}$ - the maximum possible number of links between neighbors of node $i$
The values of $c_i$ range from 0 to 1:
In [55]:
C0 = nx.Graph()
C0.add_edges_from([(1,3),(4,3),(5,3),(2,3)])
nx.draw_spring(C0,with_labels=True)
In this case we have for instance:

$$c_3 = \frac{0}{6} = 0$$

In [56]:
C1 = nx.Graph()
C1.add_edges_from([(1,3),(4,3),(5,3),(2,3),(1,4),(1,2),(2,5)])
nx.draw_spring(C1,with_labels=True)
For the above network, the local clustering coefficient of the node 3 is equal to

$$c_3 = \frac{3}{6} = 0.5$$
In [57]:
C2 = nx.Graph()
C2.add_edges_from([(1,3),(4,3),(5,3),(2,3),(1,4),(1,2),(2,5),(4,5),(1,5),(4,2)])
nx.draw_spring(C2,with_labels=True)
In this case we have:

$$c_3 = \frac{6}{6} = 1$$

Please note that in the last example the neighbors of the target node 3 are connected via a complete graph. Indeed, if we remove node 3 from the network, the resulting graph will have the maximal possible number of links:
In [58]:
C2.remove_node(3)
nx.draw_spring(C2,with_labels=True)
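The three values of $c_3$ obtained above can be reproduced directly from the definition $c_i = 2L_i / (k_i(k_i - 1))$. The sketch below is a plain-Python illustration; `build_adjacency` and `local_clustering` are names introduced here, and returning 0 for nodes with fewer than two neighbors is a convention (the same one that appears in the NetworkX output for degree-1 nodes), not something the definition itself fixes.

```python
from itertools import combinations

def build_adjacency(edges):
    """Undirected adjacency structure: node -> set of neighbors."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def local_clustering(adjacency, i):
    """c_i = 2 L_i / (k_i (k_i - 1)), with c_i = 0 by convention if k_i < 2."""
    neighbors = adjacency[i]
    k = len(neighbors)
    if k < 2:
        return 0.0
    # L_i: number of links among the neighbors of node i
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
    return 2 * links / (k * (k - 1))

C0 = [(1, 3), (4, 3), (5, 3), (2, 3)]
C1 = C0 + [(1, 4), (1, 2), (2, 5)]
C2 = C1 + [(4, 5), (1, 5), (4, 2)]

for edges in (C0, C1, C2):
    print(local_clustering(build_adjacency(edges), 3))  # 0.0, then 0.5, then 1.0
```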
- $c_i$ measures the network's local link density
- the more densely interconnected the neighborhood of node $i$, the higher is its local clustering coefficient
- having the clustering coefficients of each node, we can calculate the average clustering coefficient of the whole network:

$$\langle c \rangle = \frac{1}{N}\sum_{i} c_i$$

- $\langle c \rangle$ may be interpreted as the probability that two neighbors of a randomly selected node link to each other
In [63]:
c3 = nx.Graph()
c3.add_edges_from([(1,2),(2,3),(2,4),(2,5),(4,5),(4,6),(4,7),(5,7)])
ccs = ["0","1/6","0","1/3","2/3","0","1"]
labels={}
for i in range(1,8):
    labels[i]=ccs[i-1]
nx.draw_spring(c3,labels=labels)
For the above network we have:

$$\langle c \rangle = \frac{13}{42} \simeq 0.31$$
In [64]:
nx.clustering(c3)
Out[64]:
{1: 0,
2: 0.16666666666666666,
3: 0,
4: 0.3333333333333333,
5: 0.6666666666666666,
6: 0,
7: 1.0}
In [65]:
nx.average_clustering(c3)
Out[65]:
0.3095238095238095
- the global clustering coefficient $c_\Delta$ is based on triplets of nodes
- a triplet consists of three connected nodes
- a triangle includes three closed triplets, one centered on each of the nodes
In [66]:
#a triangle
tria = nx.Graph()
tria.add_edges_from([(1,2),(2,3),(3,1)])
nx.draw_spring(tria,with_labels=True)
Corresponding triplets:
- $1 \to 2 \to 3$
- $2 \to 3 \to 1$
- $3 \to 1 \to 2$
- the global clustering coefficient is the number of closed triplets (or 3 x triangles) over the total number of triplets (both open and closed)
- the first attempt to measure it was made by Luce and Perry (1949)
- this measure gives an indication of the clustering in the whole network (global)
- it can be applied to both undirected and directed networks (often called transitivity)
- the roots of the global clustering coefficient go back to the social network literature of the 1940s (ratio of transitive triplets)
$\langle c \rangle$ vs $c_\Delta$
- if we look at the definition of the local clustering coefficient, $L_i$ is the number of triangles node $i$ is participating in, as each link between two neighbors of node $i$ closes a triangle
- thus, the global coefficient also captures the degree of network clustering
- however, $\langle c \rangle$ and $c_\Delta$ are not equivalent
- in random networks the measures differ slightly
  - the average coefficient places more weight on the low degree nodes
  - the global one places more weight on the high degree nodes
  - a weighted average where each local clustering score is weighted by $k_i(k_i - 1)$ is identical to the global clustering coefficient
- you can find networks in which the metrics give significantly different results, for example a network in which nodes 1 and 2 are linked to each other and to every other node, while the remaining nodes have no further links:

$$c_i = \begin{cases} 1, & i \geq 3 \\ \frac{2}{N-1}, & i = 1, 2 \end{cases} \quad \Rightarrow \quad \langle c \rangle = 1 - O(1/N)$$

$$c_\Delta \sim \frac{1}{N}$$
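The gap between the two measures can be checked numerically. The sketch below assumes the construction in which nodes 1 and 2 are linked to each other and to all remaining nodes, which reproduces the local coefficients $c_i = 1$ for $i \geq 3$ and $c_i = 2/(N-1)$ for $i = 1, 2$. The function `clustering_measures` is a hand-written helper introduced for illustration; in NetworkX the corresponding calls would be `nx.average_clustering` and `nx.transitivity`.

```python
from itertools import combinations

def clustering_measures(adjacency):
    """Return (average local clustering, global clustering) of an undirected network."""
    local = []
    closed = 0    # closed triplets, i.e. 3 x the number of triangles
    triplets = 0  # all connected triplets, open and closed
    for node, neighbors in adjacency.items():
        k = len(neighbors)
        links = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
        possible = k * (k - 1) // 2  # triplets centered on this node
        local.append(links / possible if possible else 0.0)
        closed += links
        triplets += possible
    c_avg = sum(local) / len(local)
    c_global = closed / triplets if triplets else 0.0
    return c_avg, c_global

N = 100
adjacency = {i: set() for i in range(1, N + 1)}

def add_edge(u, v):
    adjacency[u].add(v)
    adjacency[v].add(u)

add_edge(1, 2)
for i in range(3, N + 1):  # every other node is linked to nodes 1 and 2 only
    add_edge(1, i)
    add_edge(2, i)

c_avg, c_global = clustering_measures(adjacency)
print(c_avg, c_global)
```

For this construction the count gives $c_\Delta = 3/N$ exactly, so with $N = 100$ the global coefficient is about 0.03 while the average coefficient stays close to 1. The ratio `closed / triplets` is also the weighted average of the local coefficients with weights $k_i(k_i - 1)$.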