introduction - beck-shop.de€¦4 the structure of complex networks “food” “network”...

21
Introduction 1 The impossibility of separating the nomenclature of a science from the science itself, is owing to this, that every branch of physical science must consist of three things; the series of facts which are the objects of the science, the ideas which represent these facts, and the words by which these ideas are expressed. Like three impressions of the same seal, the word ought to produce the idea, and the idea to be a picture of the fact. Antoine-Laurent de Lavoisier, Elements of Chemistry (1790) 1.1 What are networks? The concept of networks is very intuitive to everyone in modern society. As soon as we see this word we think of a group of interconnected items. According to the Oxford English Dictionary (2010) the word ‘network’ appeared for the first time in the English language in 1560, in the Geneva Bible, Exodus, xxvii.4: ‘And thou shalt make unto it a grate like networke of brass.’ In this case it refers to ‘a net-like arrangement of threads, wires, etc.’, though it is also recorded in 1658 as referring to reticulate patterns in animals and plants (OED, 2010). New uses for the word were introduced subsequently in 1839 for rivers and canals, in 1869 for railways, in 1883 for distributions of electrical cables, in 1914 for wireless broadcasting, and in 1986 to refer to the Internet, among many others. Currently, the OED (2010) defines a network as an ‘arrangement of intersecting horizontal and vertical lines’ or ‘a group or system of interconnected people or things’, including the following examples: a complex system of railways, roads, or other; a group of people who exchange information for professional or social purposes; a group of broad- casting stations that connect for the simultaneous broadcast of a programme; a number of interconnected computers, machines, or operations; a system of connected electrical conductors. The generality of this concept ensures that ‘network’ is among the most commonly used words in current language. For instance, a search for the word via Google (on 3 August 2010) produced 884 million items. Figure 1.1 illustrates the comparative results of a search for other frequently used words, such as ‘food,’ ‘football,’ ‘sex’, and ‘coffee’.

Upload: lediep

Post on 08-Sep-2018

212 views

Category:

Documents


0 download

TRANSCRIPT

Introduction

1The impossibility of separating the nomenclature of a science from thescience itself, is owing to this, that every branch of physical science mustconsist of three things; the series of facts which are the objects of thescience, the ideas which represent these facts, and the words by whichthese ideas are expressed. Like three impressions of the same seal, theword ought to produce the idea, and the idea to be a picture of the fact.

Antoine-Laurent de Lavoisier,Elements of Chemistry (1790)

1.1 What are networks?The concept of networks is very intuitive to everyone in modern society.As soon as we see this word we think of a group of interconnected items.According to the Oxford English Dictionary (2010) the word ‘network’appeared for the first time in the English language in 1560, in the GenevaBible, Exodus, xxvii.4: ‘And thou shalt make unto it a grate like networkeof brass.’ In this case it refers to ‘a net-like arrangement of threads, wires,etc.’, though it is also recorded in 1658 as referring to reticulate patternsin animals and plants (OED, 2010). New uses for the word were introducedsubsequently in 1839 for rivers and canals, in 1869 for railways, in 1883 fordistributions of electrical cables, in 1914 for wireless broadcasting, and in 1986to refer to the Internet, among many others. Currently, the OED (2010) definesa network as an ‘arrangement of intersecting horizontal and vertical lines’ or‘a group or system of interconnected people or things’, including the followingexamples: a complex system of railways, roads, or other; a group of people whoexchange information for professional or social purposes; a group of broad-casting stations that connect for the simultaneous broadcast of a programme;a number of interconnected computers, machines, or operations; a system ofconnected electrical conductors. The generality of this concept ensures that‘network’ is among the most commonly used words in current language. Forinstance, a search for the word via Google (on 3 August 2010) produced 884million items. Figure 1.1 illustrates the comparative results of a search for otherfrequently used words, such as ‘food,’ ‘football,’ ‘sex’, and ‘coffee’.

4 The Structure of Complex Networks

“food” “network” “football” “sex” “coffee”0

200

400

600

800

1000

1200

1400

1600

Mill

ions

of e

ntrie

s

Fig. 1.1Relevance of the word ‘network’ incurrent language. Results of a searchvia Google (on 3 August 2010) of somepopular words in the English language.

According to the previous semantic definitions, the property of being inter-connected appears as the most salient ‘structural’ characteristic of a network.Obviously, in order to have a system of interconnections we need a group of‘entities’ and a way by which these items can be connected. For instance, ina system of railways the entities to be connected are stations, and the waythey are connected to each other is with railway lines. In general, entitiescan represent stations, individuals, computers, cities, concepts, operations,and so on, and connections can represent railway lines, roads, cables, humanrelationships, hyperlinks, and so forth. Therefore, a way of generalizing theconcept of networks in an abstract way is as follows.

A network (graph) is a diagrammatic representation of a system. It consistsof nodes (vertices), which represent the entities of the system. Pairs ofnodes are joined by links (edges), which represent a particular kind ofinterconnection between those entities.

1.2 When did the story begin?In mathematics the study of networks is known as ‘graph theory’. We will usethe words ‘network’ and ‘graph’ indistinctly in this book, with a preferencefor the first as a more intuitive and modern use of the concept. The story ofnetwork theory begins with the work published by Leonhard Euler in 1736:Solutio problematic as geometriam situs pertinentis (The Solution of a ProblemRelating to the Theory of Position) (Euler, 1736). The problem in question wasformulated as follows:

Introduction 5

In Konigsberg in Prussia, there is an island A, called the Kneiphof; theriver which surrounds it is divided into two branches, as can be seen [inFigure 1.2]. . . and these branches are crossed by seven bridges, a, b, c, d,e, f, and g. Concerning these bridges, it was asked whether anyone couldarrange a route in such a way that he would cross each bridge once and onlyonce.

A

B

C

D

c d

a b

e

g

f

Fig. 1.2The Konigsberg bridges. Schematicillustration of the bridges in Konigsberg.

Despite Euler’s solving this problem without using a picture representing thenetwork of bridges in Konigsberg, he reformulated it in a way that was theequivalent of what is now referred to as a ‘graph’. A representation derivedfrom Euler’s formulation of Konigsberg’s bridges problem is illustrated inFigure 1.3 by continuous lines which are superposed on the scheme of theislands and bridges of Figure 1.2. His paper is therefore recognised as thevery first one in which the concepts and techniques of modern graph (network)theory were used. As a consequence of this paper a new branch of mathematicswas born in 1736. For an account on the life and work of Euler in particularrespect to the Konigsberg’s bridges problem, the reader is referred to the worksof Assad (2007), Gribkovskaia et al. (2007), and Mallion (2007).

DA

B

C

c

de

g

fba

Fig. 1.3Graph of the Konigsberg bridges.Superposition of a multigraph (seefurther) over the schematic illustration ofthe bridges in Konigsberg.

The history of graph theory from 1736 to 1936 is well documented by Biggset al. (1976), and the reader is referred to this book for further details. Thereis one very important moral to be derived from Euler’s original work, whichwe would like to remark on here. What Euler demonstrated is the existenceof certain types of ‘real-world’ problems that cannot necessarily be solved byusing mathematical tools such as geometry and calculus. That is, navigatingthrough Konigsberg’s bridges without crossing any of them more than oncedepends neither on the length of the bridges nor on any of their geometricfeatures. It depends only on the way in which the bridges are interconnected—their connectivity. This historical message is important for those who eventoday deny the value of topological concepts for solving important real-worldproblems ranging from molecular branching to the position of individuals intheir social networks.

A few other historical remarks are added here to provide an idea of theinterdisciplinary nature of network/graph theory since the very early daysof its birth. The word ‘graph’ arises in a completely interdisciplinary con-text. In 1878 the English mathematician James J. Sylvester wrote a paperentitled ‘Chemistry and Algebra’, which was published in Nature (Sylvester,1877–78). Here the word ‘graph’ derived from the word used by chemists atthat time for the terminology of molecular structure. The word appeared forthe first time in the following passage:

Every invariant and covariant thus becomes expressible by a graph preciselyidentical with a Kekulean diagram or chemicograph.

Many examples of graph drawing have been collected by Kruja et al. (2002),including family trees in manuscripts of the Middle Ages, and ‘squares ofopposition’ used to depict the relations between propositions and syllogismsin teaching logic, from the same period. One very early application of net-works for representing complex relationships was the publication of FrancoisQuesnay’s Tableau Economique (Quesnay, 1758), in which a circular flow offinancial funds in an economy is represented with a network. In other areas,

6 The Structure of Complex Networks

such as the study of social (Freeman, 2004) and ecological relationships, therewere also important uses of network theory in early attempts at representingsuch complex systems. For instance, in 1912, William Dwight Pierce pub-lished a book entitled The Insect Enemies of the Cotton Boll Weevil, in whichwe find a network of the ‘enemies’ of the cotton plant which at the sametime are attacked by their own ‘enemies’, forming a directed network in anecological system (Pierce et al., 1912). Several such fields have maintaineda long tradition in the use of network theory, having today entire areas ofstudy such as chemical graph theory (Balaban, 1976; Trinajstic, 1992) socialnetworks theory (Wasserman and Faust, 1994), and the study of food webs(Dunne, 2005).

Lastly, it should be mentioned that the transition from these pioneeringworks to the modern theory of networks was very much guided and influencedby Frank Harary, who is recognised as the ‘father’ of modern graph theory. Themultidisciplinary character of his work can be considered as pioneering andinspiring because, as written in his obituary in 2005, he ‘authored/co-authoredmore than 700 scholarly papers in areas as diverse as Anthropology, Biology,Chemistry, Computer Science, Geography, Linguistics, Music, Physics,Political Science, Psychology, Social Science, and, of course Mathematics,which brought forth the usefulness of Graph Theory in scientific thought’(Ranjan, 2005). Among his other books are Graph Theory and TheoreticalPhysics (1968), which he edited, Structural Models in Anthropology (1984)and Island Networks (1996), co-authored with Hage, and his classic GraphTheory (1969).

1.3 A formal definitionAlthough the working definition of networks presented in Section 1.1 accountsfor the general concept, it fails in accounting for the different ways in whichpairs of nodes can be linked. For instance, every pair of different nodes canbe connected by a simple segment of a line. In this case we encounter thesimple network as illustrated in Figure 1.4 (top left). Links in a network canalso have directionality. In this case, a directed link begins in a given nodeand ends in another, and the corresponding network is known as a directednetwork (bottom right of the figure). In addition, more than one link can existbetween a pair of nodes, known as multi-links, and some links can join anode to itself, known as self-loops. These networks are called pseudonetworks(pseudographs) (see Figure 1.4). For more details and examples of these typesof network, the reader is referred to Harary (1969).

Finally, links can have some weights, which are real (in general positive)numbers assigned to them. In this case we have weighted networks (Barratet al., 2004; Newman, 2004b). Such weights can represent a variety of conceptsin the real world, such as strength of interactions, the flow of information,the number of contacts in a period of time, and so on. The combination ofall these types of link is possible if we consider a weighted directed networklike that illustrated in Figure 1.5 (bottom right), which is the most generalrepresentation of a network.

Introduction 7

Fig. 1.4Different kinds of network. (Top left)A simple network representing anelectronic circuit. (Top right) Amulti-network representing thefrequency of interactions in a technicalresearch group at a West Virginiauniversity made by an observer everyhalf-hour during one five-daywork-week. (Bottom left) A networkwith self-loops representing theprotein–protein interaction network ofthe herpes virus associated with Kaposisarcoma. (Bottom right) A directednetwork representing the metabolicnetwork of Helicobacter pylori(connected component only).

With these ingredients we will present some formal definitions of networks.Let us begin by considering a finite set V = {v1, v2, · · · , vn} of unspecifiedelements, and let V ⊗ V be the set of all ordered pairs [vi , v j ] of the elementsof V . A relation on the set V is any subset E ⊆ V ⊗ V . The relation E is sym-metric if [vi , v j ] ∈ E implies [v j , vi ] ∈ E , and it is reflexive if ∀v ∈ V, [v, v] ∈E . The relation E is antireflexive if [vi , v j ] ∈ E implies vi �= v j . Then wehave the following definition for simple and directed networks (Gutman andPolanski, 1987).

A simple network is the pair G = (V, E), where V is a finite set of nodesand E is a symmetric and antireflexive relation on V . In a directed networkthe relation E is non-symmetric.

The previous definition does not allow the presence of multiple links and self-loops. Consequently, we introduce the following more general definition ofnetwork.

A network is the triple G = (V, E, f ), where V is a finite set of nodes,E ⊆ V ⊗ V = {e1, e2, · · · , em} is a set of links, and f is a mapping whichassociates some elements of E to a pair of elements of V , such as thatif vi ∈ V and v j ∈ V , then f : ep → [vi , v j ] and f : eq → [v j , vi ]. Aweighted network is defined by replacing the set of links E by a set oflink weights W = {w1, w2, · · · , wm}, such that wi ∈ �. Then, a weightednetwork is defined by G = (V, W, f ).

8 The Structure of Complex Networks

GRE IRN

TUN

KOR

JPN

PRT

BRA

PRY

BGRCMR

TUR

NOR

AUT

SCO

COL

ITA

NGA

MEXARG

CHL

USA

CHE

JAM

BEL

DNK

HRV

ROM

MARFRA

DEU

YUGESP

NLD

ZAFGBR

Fig. 1.5Weighted directed network. Representation of the transfer of football players between countries after the 1998 World Cup in France. Linksare drawn, with thickness proportional to the number of players transferred between two countries.

For instance, let V = {v1, v2, v3}, E = {e1, e2, e3, e4, e5, e6}, andf : e1 → [v1, v1], f : e2 → [v3, v3], f : e3 → [v1, v3], f : e4 → [v2, v1],f : e5 → [v2, v3], f : e6 → [v3, v2], then we have the network illustrated inFig. 1.3 (left). If we consider the following mapping:

f : 1.5 → [v1, v1]f : 0.9 → [v3, v3]f : 0.6 → [v1, v3]f : 2.1 → [v2, v1]f : 0.1 → [v2, v3]f : 1.2 → [v3, v2]

we obtain the network illustrated in Figure 1.6 (right).

Introduction 9

1

3 32 2

1.5

2.1

1

0.6

0.91.2

0.1

Fig. 1.6Pseudonetwork and weightednetwork. The networks produced byusing the general definition of networkspresented in the text.

1.4 Why do we study their structure?Complex networks are the skeletons of complex systems. Complex systemsare composed of interconnected parts which display soe properties that are notobvious from the properties of the individual parts. One of the characteristics ofthe structure of these systems is their networked character, which justifies theuse of networks in their representation in a natural way. In their essay ‘Modelsof structure’, Cottrell and Pettifor (2000) have written:

There are three main frontiers of science today. First, the science of the verylarge, i.e. cosmology, the study of the universe. Second, the science of thevery small, the elementary particles of matter. Third, and by far the largest,is the science of the very complex, which includes chemistry, condensed-matter physics, materials science, and principles of engineering throughgeology, biology, and perhaps even psychology and the social and economicsciences.

Then, the natural place to look for networks is in this vast region ranging fromchemistry to socioeconomical sciences.

As we have seen in the historical account of network theory in this chapter,chemists have been well aware of the importance of the structure of networksrepresenting molecules as a way to understand and predict their properties. Toprovide an idea we illustrate two polycyclic aromatic compounds in Figure 1.7.The first represents the molecule of benzo[a]pyrene, which is present intobacco smoke and constitutes its major mutagen with metabolites which areboth mutagenic and highly carcinogenic. By replacing the ring denoted as E inbenzo[a]pyrene to the place below rings A and B, a new molecule is obtained.This polycyclic aromatic compound is called perylene, and it is known to benon-carcinogenic.

These two isomeric molecules are represented by networks of only 20 nodes,which reduces the ‘complexity’ of their analysis. However, this number canbe increased to thousands when biological macromolecules, such as proteins,are represented by networks. The lesson that can be learned from this singleexample is that the structure of a network can determine many, if not all, of theproperties of the complex system represented by it. It is believed that networktheory can help in many important areas of molecular sciences, including therational design of new therapeutic drugs (Csermely et al., 2005).

10 The Structure of Complex Networks

A B C

D E

A B C

D

E

Fig. 1.7Molecular isomeric networks. Networkrepresentation of two polycyclicaromatic compounds. The firstcorresponds to benzo[a]pyrene—ahighly carcinogenic compound found intobacco smoke. The second correspondsto perylene, which is a non-carcinogeniccompound.

According to the Oxford English Dictionary (2010), ‘structure’ is defined as‘the arrangement of and relations between the parts or elements of somethingcomplex.’ Similarly, the Cambridge Advanced Learner’s Dictionary (Walterand Woodford, 2005) defines it as ‘the way in which the parts of a system orobject are arranged or organized, or a system arranged in this way.’ The aim ofour book, therefore, is to study the way in which nodes and links are arrangedor organized in a network—though initially this appears to be a very easy taskwhich probably does not merit an entire volume. In order to obtain a betteridea about our subject, let us take a look at the networks which are depictedin Figure 1.8, which have 34 nodes. The structure of the first one is easilydescribed: every pair of nodes is connected to each other. With this informationwe can uniquely describe this network, and can obtain precise informationabout all its properties. In order to describe the second network we have to saythat all nodes are connected to two other nodes, except two nodes which areconnected to only one node. This also uniquely describes this network, and wecan know all its properties. The structure of the third network is more difficultto describe. In this case we can say that every node is connected to three othernodes. However, there are 453,090,162,062,723 networks with 34 nodes thatshare this property. Despite of this we can say a lot about this network with theinformation used to describe it. That is, there are many ‘structural’ propertieswhich are shared by all networks that have the property of having every nodeconnected to the other three. In a similar way, we can refer to the fourth andfifth networks. The fourth one has 34 nodes and 33 links—a type of structureknown as a tree, which has many organizational properties in common. Thefifth network has 34 nodes and 78 links, and it has been generated by a randomprocess in which nodes are linked with certain probability. If we define sucha process—the number of nodes and the linking probability—we can generatenetworks which asymptotically share many structural properties. We will studythese types of network and some of their properties in the various chapters ofthis book.

Now let us try to describe, in a simple way, the last network, which repre-sents the social interactions between individuals in a real-world system. It has34 nodes and 78 links, but it does not appear to be as random as the previousone. It is not a tree-like, nor does it display regularity in the number of linksper nodes, and it is far removed from the structures represented by the firsttwo networks in Figure 1.8. Then, in order to describe the ‘structure’ of this

Introduction 11

Complete Path Cubic

Random tree Random Real-world

Fig. 1.8Networks and complexity. Several types of network which can be ‘defined’ by using different levels of complexity.

network we need to add more and more information. For instance, we can saythat the network:

(i) has exponential degree distribution and is strongly disassortative (seeChapter 2);

(ii) has small average path length (see Chapter 3);(iii) has a high clustering coefficient (see Chapter 4);(iv) has two nodes with high degree, closeness, and betweenness centrality

(see Chapter 7);(v) has spectral scaling with positive and negative deviations (see Chap-

ter 9);(vi) has two main communities (see Chapter 10);

(vii) displays small bipartivity (see Chapter 11), and so on.

The list of properties for this network can be considerably extended by usingseveral of the parameters we will describe in this book, and others that are

12 The Structure of Complex Networks

Fig. 1.9Road network. The international roadnetwork in Europe.(http://en.wikipedia.org/wiki/Image:International E Road Network.pgn.)

Fig. 1.10Protein residue network. Thethree-dimensional structure of a protein,showing its secondary structure (left).Residue interaction network built byconsidering Cα atom of each amino acidas a node of the network. Two nodes areconnected if the corresponding Cα atomsare separated by at least a certaindistance—in this case, 7 A.

not covered here. However, this list of properties is probably not sufficient toaccount for the entire structure of this network. This situation is similar tothat described by Edgar Allan Poe when he wrote in relation to the structureof a strange ship in his short story entitled MS. Found in a Bottle, written in1833 (Poe, 1984): ‘What she is not, I can easily perceive—what she is I fearit is impossible to say.’ The description of the structure of networks is notimpossible, but it requires a great deal of information to make such descriptionas complete as required. In other words, the structure of some networks is ingeneral complex, understanding complexity as the amount of information thatwe need to precisely describe this structure.

A typical example is based on the ‘nature’ of the entities of the com-plex systems represented by these networks. In this case we can talk aboutbiological networks, socioeconomic networks, technological and infrastruc-tural networks, and so forth. Every group of these networks can be further

Introduction 13

Fig. 1.11World Wide Web. Image ofWWW-Gnutella from CAIDA.(Obtained, with permission, fromhttp://www.caida.org/tools/visualization/plankton/Images/.)

subdivided into several others. For instance, biological networks can be subdi-vided into interspecies networks such as food webs, intercellular networks suchas neural or vascular networks; intermolecular networks such as metabolicor protein–protein interaction networks, and interatomic networks, such asprotein residue networks. This classification system reveals the nature of thenodes, but not that of the interrelation between the entities of the complexsystems represented by them. Another possibility of classifying complex net-works is according to the nature of the links which they represent. Some ofthe possible classes of network in this scheme are presented below, with someexamples.

• Physical linking (Fig. 1.9). Pairs of nodes are physically connected by atangible link, such as a cable, a road, a vein, and so on. Examples are theInternet, urban street networks, road networks, vascular networks, and so on.

• Physical interactions (Fig. 1.10). Links between pairs of nodes representsinteractions which are determined by a physical force. Examples are proteinresidue networks, protein–protein interaction networks, and so on.

• ‘Ethereal’ connections (Fig. 1.11). Links between pairs of nodes are intan-gible, such that information sent from one node is received at another,irrespective of the ‘physical’ trajectory. Examples are the World Wide Weband airport networks.

14 The Structure of Complex Networks

Distance

Subgraph

Closure

Diameter

Degree

Walk

Trail Bridge

PathConnected

Cycle

Forest

Spanning Tree

Tree

Acyclic Graph

Height

Child

Ordered TreeRooted Tree

Level

m-ary Tree

OffspringReduced Graph

Chromatic Number

Binary Search TreeDecision Tree

Adjacency Matrix

Strongly Connected

Spanning SubgraphHomeomorphio

Bipartite Graph

GraphCondensed Graph

Loopk-Colorable

Arc

Isomorphic

Edge

Topological Order

PlanarSize

Lable

Orientation

Tournament

Neighborhood

Arc List

Digraph

Vertex

Adjacent

Clique

Regular

Complete

Leaf

Pendant Vertex

Internal Vertex

Adjacency Structure

Node

Order

Hamiltonian

Connected Component

Fig. 1.12Glossary network. The network formed by some of the terms commonly used in graph/network theory. Nodes represent concepts in graphtheory, and a link is directed from one concept to another if the second appears in the definition of the first.

• Geographic closeness. Nodes represent regions of a surface and their con-nections are determined by their geographic proximity. Examples are coun-tries on a map, landscape networks, and so on.

• Mass/energy exchange. Links connecting pairs of nodes indicate that someenergy or mass has been transferred from one node to another. Examplesare reaction networks, metabolic networks, food webs, trade networks, andso on.

• Social connections. Links represent any kind of social relationship betweennodes. Examples are friendship, collaboration, and so on.

• Conceptual linking (Fig. 1.12). Links indicate conceptual relationshipsbetween pairs of nodes. Examples are dictionaries, citation networks, andso on.

1.5 How can we speak their ‘language’?The language of networks is the language of graph theory. We will define manyof the terms used in graph and network theory in each chapter of this book, buthere it is useful to present some of the most commonly used terms (Essam andFisher, 1970; Harary, 1969; Newman, 2010).

Adjacency and incidence. Two nodes u and v are adjacent if they are joinedby a link e = {u, v}. Nodes u and v are incident with the link e, and the link e

Introduction 15

LAMBERTES

GUADAGNI

BISCHERI

PERUZZI

STROZZI

CASTELLAN

GINORIBARBADORI

ALBIZZI

RIDOLFITORNABUON

MEDICI

SALVIATI

ACCIAIUOLPAZZI

Fig. 1.13Node degrees. Representation of thenode degrees proportional to the radii ofthe circles of each node. The networkillustrated corresponds to the marriagerelations between pairs of Florentinefamilies in fifteenth century (Breiger andPattison, 1986).

BILL

DON

HARRY

MICHAELHOLLY

LEE

BERT

STEVE

RUSS

GERY

JOHN

CAROL

ANN

PAM

PAT

JENNIE

BRAZEYPAULINE

BILL

DON

MICHAEL

HOLLY

PAT

PAM JENNIE

CAROL

ANN

GERY

JOHN

RUSS

STEVE

BERT

LEE

PAULINE

HARRY

BRAZEY

Fig. 1.14Node degrees in a directed network. Representation of indegree (left) and outdegree (right) proportional to the radii of the nodes for adirected network of eighteen participants in a qualitative methods class. Ties signify that the ego indicated that the nominated alter was one ofthe three people with which s/he spent the most time during the seminar.

is incident with the nodes u and v. Two links e1 = {u, v} and e1 = {v, w} areadjacent if they are both incident with at least one node. Node degree is thenumber of links which are incident with a given node (see Fig. 1.13).

Adjacency and incidence (directed). Node u is adjacent to node v if there isa directed link from u to v e = (u, v).1 A link from u to v is incident from uand incident to v; u is incident to e, and v is incident from e. The indegree ofa node is the number of links incident to it, and its outdegree is the number oflinks incident from it (see Fig. 1.14).

1 It is generally assumed that two nodes in a directed network are adjacent if they are connectedin any direction by a link. This definition, however, conflicts with the definition of adjacency matrixthat we will study in Chapter 2.

16 The Structure of Complex Networks

MEDICI LAMBERTES

GUADAGNI

GINORI

CASTELLAN

BISCHERI

BARBADORI

ALBIZZIACCIAIUOL

TORNABUON

STROZZI

SALVIATI

RIDOLFI

PERUZZI

PAZZI

LAMBERTES BISCHERI

STROZZI

PERUZZI

CASTELLAN

ALBIZZI

GINORI

BARBADORI

ACCIAUOL

MEDICI

SALVIATI

PAZZI

RIDOLFI

TORNABUON

GUADAGNI

JENNIE HOLLY

HARRY

GERY

DON

CAROL

BRAZEY

BILL

BERTANN

LEE

JOHN

STEVE

RUSS

PAULINE

PAT

PAM

MICHAEL

BILL

DON

MICHAEL

HOLLY

PAT

PAM

JENNIE

CAROL

ANN

GERY

JOHNRUSS

STEVE

BERT

LEE

PAULINE

HARRY

BRAZEY

Fig. 1.15Isomorphic networks. Different representations of the networks of Florentine families in the fifteenth century (top), and that of participants ina qualitative methods class (bottom).

GINORI

PAZZI

SALVIATI

PERUZZI

GUADAGNI

BARBADORI

LEE

BRAZEY

BERT

STEVE

RUSS

JOHN

GERY

RIDOLFI

STROZZI

BISCHERI

CASTELLAN

Fig. 1.16Subgraphs in networks. Subgraphs found in the networks of Florentine families in the fifteenth century (left), and that of participants in aqualitative methods class (right).

Isomorphism (Fig. 1.15). Two networks G1 and G2 are isomorphic if thereis a one-to-one correspondence between the nodes of G1 and those of G2, suchas the number of links joining each pair of nodes in G1 is equal to that joiningthe corresponding pair of nodes in G2. If the networks are directed, the linksmust coincide not only in number but also in direction.

Introduction 17

Fig. 1.17Cliques in networks. Representation ofa five-clique found in a networkrepresenting inmates in prison.

GINORIACCIAIUOL

PAZZI

SALVIATI

MEDICI

RIDOLFI

CASTELLAN

PERUZZI

ALBIZZI

GUADAGNI

LAMBERTES

BISCHERI

BARBADORI

STROZZI

TORNABUON

LEEGERY

HOLLY

PAT

JENNIE

ANN

CAROL

DON

BILL

BERT

RUSSBRAZEY

STEVE

JOHN PAULINE

PAM

MICHAEL

HARRY

Fig. 1.18Paths in networks. Paths in the networks of Florentine families in the fifteenth century (left), and that of participants in a qualitative methodsclass (right).

Subgraph (Fig. 1.16). S = (V ′, E ′) is a subgraph of a network G = (V, E)

if and only if V ′ ⊆ V and E ′ ⊆ E .Clique (Fig. 1.17). A clique is a maximal complete subgraph of a network.

The clique number is the size of the largest clique in a network.Walk. A (directed) walk of length l is any sequence of (not necessarily

different) nodes v1, v2, · · · , vl , vl+1 such that for each i = 1, 2, · · · , l there is a

18 The Structure of Complex Networks

GINORI

ACCIAIUOL

PAZZI

SALVIATI

MEDICI

RIDOLFI

PERUZZI

CASTELLAN

ALBIZZI

GUADAGNI

LAMBERTESBISCHERI

BARBADORI

STROZZI

TORNABUON

GERY

HOLLY

DON

BILL

PAT

ANN

CAROL

PAULINE

RUSS

BERT

STEVE

LEE

JOHN

PAM

JENNIE

BRAZEY

HARRY

MICHAEL

Fig. 1.19Cycles in networks. Cycles in the networks of Florentine families in the fifteenth century (left), and that of participants in a qualitativemethods class (right).

GERY

MICHAEL

HOLLY

DON

BILL

PAT

ANN

CAROL

PAULINE

RUSS

BERT

STEVE

LEE

JOHNBRAZEY

PAM

HARRY

JENNIE JENNIE

GERY

HOLLY

DON

BILL

PAT

ANN

CAROL

RUSS

BERT

STEVE

LEE

JOHN

PAMBRAZEY

PAULINE

MICHAEL

HARRY

Fig. 1.20Underlying graph of a directed network. Directed network of participants in a qualitative methods class (left) and its underlying graph(right).

link from vi to vi+1. This walk is referred to as a walk from v1 to vl+1. A closedwalk (CW) of length l is a walk v1, v2, · · · , vl , vl+1 in which vl+1 = v1.

Path (Fig. 1.18). A path of length l is a walk of length l in which all thenodes (and all the links) are distinct. A trial has all the links different, but notnecessarily all the nodes.

Cycle (Fig. 1.19). A cycle is closed walk in which all the links and all thenodes (except the first and last) are distinct.

Underlying network (Fig. 1.20). This is the network obtained from a directednetwork by removing the directionality of all links.

Connected components (Fig. 1.21). A (directed) network is connected ifthere is a path between each pair of nodes in the (underlying) network. Other-wise it is disconnected. Every connected subgraph is a connected componentof the network.

Connectivity. The node connectivity of a connected network is the smallestnumber of nodes whose removal makes the network disconnected. The edge

Introduction 19

Fig. 1.21Components in networks. The twentyconnected components existing in theprotein–protein interaction (PPI)network of E. coli.

LAMBERTESBISCHERI

PERUZZI

STROZZI

ALBIZZI

RIDOLFITORNABUON

MEDICI

ACCIAIUOL

Node cutset Link cutset

SALVIATI

PAZZI

BARABADORIGINORI

GUADAGNI

CASTELLAN

LAMBERTESBISCHERI

PERUZZI

STROZZI

ALBIZZI

RIDOLFITORNABUON

MEDICI

ACCIAIUOL

SALVIATI

PAZZI

BARABADORIGINORI

GUADAGNI

CASTELLAN

Fig. 1.22Cutsets in networks. A node cutset (left) and a link cutset (right) in the network representing Florentine families in the fifteenth century.

20 The Structure of Complex Networks

BILL

DON

HARRY

MICHAEL

HOLLY

PAT

JENNIE

ANN

CAROL

PAULINEJOHNRUSS

GERY

STEVE

LEE

BRAZEY

BERT

PAM

Fig. 1.23Strongly connected components in networks. Four strongly connected components in the network of participants in a qualitative methodsclass. One of the strongly connected components has ten nodes and are represented by squares, another has five nodes and is represented bycircles, and two others have only one node each and are represented by triangles.

Fig. 1.24Trees. A random tree (left) and a Cayley tree, each with 500 nodes.

Introduction 21

LAMBERTES

BISCHERI

PERUZZI

STROZZI

ALBIZZI

RIDOLFITORNABUON

MEDICI

ACCIAIUOL

SALVIATI

PAZZI

BARABADORI

GINORI

GUADAGNI

CASTELLAN

Fig. 1.25Spanning trees in networks. Aspanning tree in the networkrepresenting Florentine families in thefifteenth century.

connectivity of a connected network is the smallest number of links whoseremoval makes the network disconnected.

Cutsets (Fig. 1.22). A node cutset of a connected network is a set of nodessuch as the removal of all of them disconnects the network. A link cutset ofa connected network is a set of links such that the removal of all of themdisconnects the network.

Strongly connected component (Fig. 1.23). A directed network is stronglyconnected if there is a path between each pair of nodes. The strongly con-nected components of a directed network are its maximal strongly connectedsubgraphs.

Tree (Fig. 1.24). A tree of n nodes is a network for which the followingstatements are equivalent:

LAMBERTES

BISCHERI

PERUZZI

STROZZI

ALBIZZI

RIDOLFI

TORNABUON

MEDICI

ACCIAIUOL

network complement

SALVIATI

PAZZI

BARABADORI

GINORI

GUADAGNI

CASTELLAN

LAMBERTES

BISCHERI

PERUZZI

STROZZI

ALBIZZI

RIDOLFI

TORNABUON

MEDICI

ACCIAIUOL

SALVIATI

PAZZI

BARABADORI

GINORI

GUADAGNI

CASTELLAN

Fig. 1.26Network complement. Network representing Florentine families in the fifteenth century (left), and its complement (right).

22 The Structure of Complex Networks

Fig. 1.27Bipartite networks. A bipartite networkrepresenting heterosexual relationshipamong individuals. Individuals of eachsex are represented as squares or circles,respectively.

Fig. 1.28Planar networks. A planar networkrepresenting the network of galleriesmade by ants.

• It is connected and has no cycles.• It has n − 1 links and no cycle.• It is connected and has n − 1 links.• It is connected and becomes disconnected by removing any link.• Any pair of nodes is connected by exactly one path.• It has no cycles, but the addition of any new link creates a cycle.

Spanning tree (Fig. 1.25). A spanning tree of a network is a subgraph of thisnetwork that includes every node and is a tree.

Introduction 23

Complement of a network (Fig. 1.26). The complement of a network G,denoted G, is a network created by taking all nodes of G and joining twonodes whenever they are not linked in G (see Figure 1.25).

Regular network. An r -regular network is a network in which all nodes havedegree r and it has rn/2 links.

Complete networks. A network with n nodes is complete, denoted Kn , ifevery pair of nodes is connected by a link. That is, there are n(n − 1)/2 links.

Empty networks. A network that has no link. It is denoted as K n , as is thecomplement of the complete network.

Cycle networks. A regular network of degree 2; that is, a 2-regular network,denoted by Cn .

Network colouring. A k-colouring of a simple network is an assignment ofat most k colours to the nodes of the network, such that adjacent nodes arecoloured with different colours. The network is then said to be k-colourable.The chromatic number is the smallest number k for which the network isk-colourable.

Bipartite networks (Fig. 1.27). A network is bipartite if its nodes can be splitinto two disjoint (non-empty) subsets V1 ⊂ V (V1 �= φ) and V2 ⊂ V (V2 �= φ)and V1 ∪ V2 = V , such that each link joins a node in V1 and a node in V2.Bipartite networks do not contain cycles of odd length. If all nodes in V1 areconnected to all nodes in V2 the network is known as complete bipartite and isdenoted Kn1,n2 , where n1 = |V1| and n2 = |V2| are the number of nodes in V1and V2, respectively. A network is bipartite if it is 2-colourable.

k−partite networks. A network is k-partite if its nodes can be split into kdisjoint (non-empty) subsets V1 ∪ V2 ∪ · · · ∪ Vk = V , such that each link joinsa node in two different subsets. If all nodes in a subset are connected to allnodes in the other subsets, the network is known as complete k-partite and isdenoted Kn1,n2,··· ,nk , where ni = |Vi | is the number of nodes in Vi .

Planar networks (Fig. 1.28). A network is planar if it can be drawn in a planein such a way that no two links intersect except at a node with which they areboth incident.