dynamics of complex systems self-similar phenomena and networks
DESCRIPTION
DYNAMICS OF COMPLEX SYSTEMS: Self-similar phenomena and Networks. Guido Caldarelli, CNR-INFM Istituto dei Sistemi Complessi. 5/6. STRUCTURE OF THE COURSE: SELF-SIMILARITY (ORIGIN AND NATURE OF POWER-LAWS), GRAPH THEORY AND DATA, SOCIAL AND FINANCIAL NETWORKS
![Page 1: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/1.jpg)
DYNAMICS OF COMPLEX SYSTEMS: Self-similar phenomena and Networks
Guido Caldarelli, CNR-INFM Istituto dei Sistemi Complessi
5/6
![Page 2: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/2.jpg)
1. SELF-SIMILARITY (ORIGIN AND NATURE OF POWER-LAWS)
2. GRAPH THEORY AND DATA
3. SOCIAL AND FINANCIAL NETWORKS
4. MODELS
5. INFORMATION TECHNOLOGY
6. BIOLOGY
•STRUCTURE OF THE COURSE
![Page 3: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/3.jpg)
•STRUCTURE OF THE FIFTH LECTURE
5.1) INTERNET
5.2) STATISTICAL PROPERTIES OF INTERNET
5.3) WORLD WIDE WEB
5.4) STATISTICAL PROPERTIES OF WWW
5.5) HITS ALGORITHM
5.6) PAGERANK
5.7) WIKIPEDIA
5.8) STATISTICAL PROPERTIES OF WIKIPEDIA
![Page 4: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/4.jpg)
![Page 5: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/5.jpg)
[email protected]> traceroute www.louvre.fr
1 141.108.1.115 Rome pcpil
2 141.108.5.4 Unknown
3 193.206.131.13 Unknown rc-infnrmi.rm.garr.net
4 193.206.134.161 Unknown rt-rc-1.rm.garr.net
5 193.206.134.17 Unknown mi-rm-1.garr.net
6 212.1.196.25 South Cambridgesh garr.it.ten-155.net
7 212.1.192.37 South Cambridgesh ch-it.ch.ten-155.net
8 212.1.194.14 Genève geneva5.ch.eqip.net
9 195.206.65.105 Genève geneva1.ch.eqip.net
10 0.0.0.0 Unknown No Response
11 193.251.150.30 Unknown p6.genar2.geneva.opentransit.net
12 193.251.154.97 PARIS, FR p43.bagbb1.paris.opentransit.net
Previous maps have been computed through extensive collection of traceroutes
•5.1 INTERNET: Traceroute
![Page 6: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/6.jpg)
The results let us quantify the hierarchical nature of the AS connections:
$P(A) \propto A^{-2}$
The plot of C(A) shows the same optimisation found in food webs:
$C(A) \propto A^{\eta}$
•5.1 INTERNET: Traceroute
![Page 7: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/7.jpg)
•Measure Forward IP Paths: skitter records each hop from a source to many destinations by incrementing the "time to live" (TTL) of each IP packet header and recording replies from each router (or hop) leading to the destination host.
•Measure Round Trip Time: skitter collects round trip time (RTT) along with path (hop) data. skitter uses ICMP echo requests as probes to a list of IP destinations.
•Track Persistent Routing Changes: skitter data can provide indications of low-frequency persistent routing changes. Correlations between RTT and time of day may reveal a change in either forward or reverse path routing.
•Visualize Network Connectivity: by probing the paths to many destination IP addresses spread throughout the IPv4 address space, skitter data can be used to visualize the directed graph from a source to much of the Internet.
skitter is a tool for actively probing the Internet to analyze topology and performance.
http://www.caida.org/tools/measurements/skitter
•5.1 INTERNET: Skitter
![Page 8: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/8.jpg)
•5.1 INTERNET:The structure
![Page 9: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/9.jpg)
This happens at both the domain level and the router level
•P(k) = probability that a node has k links
Faloutsos et al. (1999)
•5.1 INTERNET: Autonomous Systems
![Page 10: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/10.jpg)
Internet map measurements:
•CAIDA
•NLANR
•Mercator project
•IPM
•Bell Labs
•5.1 INTERNET: AS Maps
![Page 11: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/11.jpg)
Vázquez, Pastor-Satorras and Vespignani, PRE 65, 066130 (2002)
•5.2 INTERNET: AS Statistical Properties
![Page 12: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/12.jpg)
•5.2 INTERNET: AS Statistical Properties
![Page 13: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/13.jpg)
•5.2 INTERNET: AS Statistical Properties
![Page 14: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/14.jpg)
Nodes: (static) HTML pages
Edges (directed): hyperlinks between pages
•5.3 WORLD WIDE WEB
![Page 15: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/15.jpg)
Why are we interested in the WebGraph?
From link analysis:
• Data mining (e.g. PageRank)
• Sociology of content creation
• Detection of communities
With a “good” WebGraph model:
• Prove formal properties of algorithms
• Detect peculiar regions of the WebGraph
• Predict evolution of new phenomena
•5.3 WWW: Introduction
![Page 16: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/16.jpg)
• Random Graph (Erdős, Rényi)
• Evolving networks (Albert, Barabási, Jeong)
• “Copying” models (Kumar, Raghavan,…)
• ACL for massive graph (Aiello, Chung, Lu)
• Small World (Watts, Strogatz)
• Fitness (Caldarelli, Capocci, De Los Rios, Muñoz)
• Multi-Layer (Caldarelli, De Los Rios, Laura, Leonardi)
•5.3 WWW: Models
![Page 17: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/17.jpg)
•5.4 WWW: Statistical Properties
![Page 18: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/18.jpg)
•Bow-tie structure
•Small World for the SCC and the weakly connected components
Broder et al. , Graph structure in the web
•5.4 WWW: Statistical Properties
![Page 19: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/19.jpg)
• Explicit (or “self-aware”) communities:
1. Webrings
2. Newsgroup users
3. Gnutella, Morpheus, etc. users
• Implicit communities:
1. Fan-Center Bipartite Cores
Kumar et al. , Crawling the Web for Emerging Cyber Communities
•5.4 WWW: Statistical Properties
![Page 20: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/20.jpg)
•5.5 HITS Algorithm
A first way to search the WWW was based on a rough division of the sites into authorities and hubs
![Page 21: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/21.jpg)
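•5.5 HITS Algorithm
A site i can have both an authority $x_i$ and a hubness $y_i$
The authority is the sum of the hubness of the sites pointing in: $x_i = \sum_{j \to i} y_j$
The hubness is the sum of the authorities of the sites pointed to: $y_i = \sum_{i \to j} x_j$
In matrix form, with $A$ the adjacency matrix ($A_{ij} = 1$ if site i points to site j): $x = A^{T} y$, $y = A x$
Substituting one into the other: $x = A^{T} A\, x$, $y = A A^{T} y$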
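The authority and hubness relations on this slide can be iterated numerically. The following is a minimal sketch, not the original HITS implementation: the function name `hits`, the dense adjacency-list-of-lists input, and the normalisation of both vectors to unit sum at every step are assumptions made for illustration.

```python
def hits(adj, iterations=100):
    """Iterate x = A^T y and y = A x.

    adj[i][j] is 1 if page i links to page j, else 0 (assumed input format).
    Returns (authority, hubness), each normalised to sum to 1.
    """
    n = len(adj)
    x = [1.0] * n  # authority scores
    y = [1.0] * n  # hubness scores
    for _ in range(iterations):
        # Authority of i: sum of the hubness of the pages j pointing to i.
        x = [sum(adj[j][i] * y[j] for j in range(n)) for i in range(n)]
        # Hubness of i: sum of the authority of the pages j that i points to.
        y = [sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Normalise so the scores stay bounded (assumed convention).
        sx = sum(x) or 1.0
        sy = sum(y) or 1.0
        x = [v / sx for v in x]
        y = [v / sy for v in y]
    return x, y


# Toy graph: 0 -> 2, 1 -> 2, 1 -> 3
toy = [
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
authority, hubness = hits(toy)
```

In the toy graph, page 2 is pointed to by both hubs and ends up with the largest authority, while page 1, which points to both authorities, ends up with the largest hubness.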
![Page 22: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/22.jpg)
•5.6 PAGERANK
Why does GOOGLE work so well?
PageRank does not distinguish between different kinds of pages. Every page has a “PageRank” that it shares among the pages it points to
The larger the out-degree, the larger the reduction of the “vote”
![Page 23: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/23.jpg)
•5.6 PAGERANK
A preliminary definition of PageRank can be the following:
$$PR(i) = \sum_{j \to i} \frac{PR(j)}{k_j^{out}}, \qquad \vec{PR} = N\,\vec{PR}$$
So the PageRank is defined as the eigenvector of the normal matrix $N$
The eigenvector is computed by recursion, but THERE ARE TRAPPING STATES! These destroy convergence.
$$PR_i^{t+1} = \sum_{j \to i} \frac{PR_j^{t}}{k_j^{out}}$$
![Page 24: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/24.jpg)
•5.6 PAGERANK
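The suggestion of Brin and Page was to allow, every now and then, a jump to a randomly selected page.
$$\vec{PR} = [\alpha N + (1-\alpha) E]\,\vec{PR}$$
Here $E$ is the matrix describing the jump to a randomly selected page. The value of $\alpha$ was originally taken as 0.85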
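The recursion with random jumps can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the production algorithm: the function name `pagerank`, the adjacency-list input, the choice of a uniform jump (every entry of E equal to 1/n), and the uniform redistribution of the rank of pages without out-links are all assumptions.

```python
def pagerank(links, alpha=0.85, iterations=100):
    """Power iteration for PR = [alpha*N + (1 - alpha)*E] PR.

    links[j] is the list of pages that page j points to (assumed format).
    """
    n = len(links)
    pr = [1.0 / n] * n
    for _ in range(iterations):
        # Random-jump term: with probability 1 - alpha land on any page.
        new = [(1.0 - alpha) / n] * n
        for j, targets in enumerate(links):
            if targets:
                # Page j shares its rank equally among its out-links.
                share = alpha * pr[j] / len(targets)
                for i in targets:
                    new[i] += share
            else:
                # Trapping state (no out-links): spread its rank uniformly.
                # This treatment is an assumption of the sketch.
                for i in range(n):
                    new[i] += alpha * pr[j] / n
        pr = new
    return pr


# Toy graph: 0 -> 1, 0 -> 2, 1 -> 2, page 2 has no out-links.
ranks = pagerank([[1, 2], [2], []])
```

Without the jump term and the special case for pages with no out-links, the rank reaching page 2 would never leave it, which is the trapping-state problem mentioned on the previous slide.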
![Page 25: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/25.jpg)
STATISTICAL PROPERTIES OF THE WIKIGRAPH
A. Capocci, V. Servedio, F. Colaiori, D. Donato, L.S. Buriol, S. Leonardi, G. Caldarelli
Centro “E. Fermi”
arXiv:physics/0602026
•5.7 WIKIPEDIA
![Page 26: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/26.jpg)
•5.7 WIKIPEDIA
![Page 27: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/27.jpg)
Wikipedia in other languages
You may read and edit articles in many different languages:
Wikipedia encyclopedia languages with over 100,000 articles
Deutsch (German) · Français (French) · Italiano (Italian) · 日本語 (Japanese) · Nederlands (Dutch) · Polski (Polish) · Português (Portuguese) · Svenska (Swedish)
Wikipedia encyclopedia languages with over 10,000 articles
العربية (Arabic) · Български (Bulgarian) · Català (Catalan) · Česky (Czech) · Dansk (Danish) · Eesti (Estonian) · Español (Spanish) · Esperanto · Galego (Galician) · עברית (Hebrew) · Hrvatski (Croatian) · Ido · Bahasa Indonesia (Indonesian) · 한국어 (Korean) · Lietuvių (Lithuanian) · Magyar (Hungarian) · Bahasa Melayu (Malay) · Norsk bokmål (Norwegian) · Norsk nynorsk (Norwegian) · Română (Romanian) · Русский (Russian) · Slovenčina (Slovak) · Slovenščina (Slovenian) · Српски (Serbian) · Suomi (Finnish) · Türkçe (Turkish) · Українська (Ukrainian) · 中文 (Chinese)
Wikipedia encyclopedia languages with over 1,000 articles
Alemannisch (Alemannic) · Afrikaans · Aragonés (Aragonese) · Asturianu (Asturian) · Azərbaycan (Azerbaijani) · Bân-lâm-gú (Min Nan) · Беларуская (Belarusian) · Bosanski (Bosnian) · Brezhoneg (Breton) · Чăваш чěлхи (Chuvash) · Corsu (Corsican) · Cymraeg (Welsh) · Ελληνικά (Greek) · Euskara (Basque) · فارسی (Persian) · Føroyskt (Faroese) · Frysk (Western Frisian) · Gaeilge (Irish) · Gàidhlig (Scots Gaelic) · हिन्दी (Hindi) · Interlingua · Íslenska (Icelandic) · Basa Jawa (Javanese) · ქართული (Georgian) · ಕನ್ನಡ (Kannada) · Kurdî / كوردی (Kurdish) · Latina (Latin) · Latviešu (Latvian) · Lëtzebuergesch (Luxembourgish) · Limburgs (Limburgish) · Македонски (Macedonian) · मराठी (Marathi) · Napulitana (Neapolitan) · Occitan · Ирон (Ossetic) · Plattdüütsch (Low Saxon) · Scots · Sicilianu (Sicilian) · Simple English · Shqip (Albanian) · Sinugboanon (Cebuano) · Srpskohrvatski/Српскохрватски (Serbo–Croatian) · தமிழ் (Tamil) · Tagalog · ภาษาไทย (Thai) · Tatarça (Tatar) · తెలుగు (Telugu) · Tiếng Việt (Vietnamese) · Walon (Walloon)
Complete list · Multilingual coordination · Start a Wikipedia in another language
•5.7 WIKIPEDIA
![Page 28: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/28.jpg)
A Nature investigation aimed to find out whether Wikipedia is an authoritative source of information compared with established sources such as Encyclopaedia Britannica. Among 42 entries tested, the difference in accuracy was not particularly great:
• the average science entry in Wikipedia contained around four inaccuracies;
• the one in Britannica, about three.
On the other hand, the articles on Wikipedia are on average longer than those of Britannica. This accounts for a lower rate of errors in Wikipedia.
(Nature 438, 900-901; 2005)
In a survey of more than 1,000 Nature authors:
• 70% had heard of Wikipedia;
• 17% of those consulted it on a weekly basis;
• less than 10% help to update it.
•5.7 WIKIPEDIA
![Page 29: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/29.jpg)
•5.7 WIKIPEDIA
![Page 30: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/30.jpg)
Actually, things are a little bit more complicated
•5.7 WIKIPEDIA
![Page 31: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/31.jpg)
There is not only “control” by users, but also conflict of interests. As a result, it is sometimes not possible to modify 100% of the structure, since some pages are locked. One of the biggest scandals concerned the biography of journalist John Seigenthaler, which accused him of being involved in the murder of President J.F. Kennedy.
Some topics and languages are more tightly controlled than others. In an experiment, the Italian newspaper “L’espresso” deliberately introduced errors into two entries:
• one in the career of football player Rui Costa (stating that he had been part of an Italian team in the early 90’s);
• one introducing a non-existent philosopher.
Obviously:
• the error for the football player was corrected after 30 minutes;
• the philosopher remained in place until the experiment was published (at least two weeks).
•5.7 WIKIPEDIA
![Page 32: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/32.jpg)
• sociological reasons: the encyclopedia collects pages written by a number of independent and heterogeneous individuals. Each of them autonomously decides about the content of the articles, with the only constraint of a prefixed layout. This autonomy is a common feature of content creation on the Web. The wikipedia authors’ community is formed by members whose only wish is to make available to the world concepts and topics that they consider meaningful. In some sense, tracing the evolution of the wikipedia subsets should mirror the development of significant trends within each linguistic community.
• generation in time: wikipedia provides time information associated with nodes. Moreover, it provides old information: time information for the creation and the modifications of each page in the dataset.
• independence from external links: wikipedia articles link mainly to articles in the same dataset.
• variety of graph sizes: one graph can be collected per language, and the graph sizes vary from a few hundred pages up to half a million pages.
WHY STUDY WIKIPEDIA?
•5.7 WIKIPEDIA
![Page 33: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/33.jpg)
Summarizing:
• We have available all the history of growth, so that we can study the evolution
• We have an example of a “social” network of huge size
• We can compare the systems produced by users of different languages, thereby measuring the effect of different cultures.
• We can study Wikipedia as a case study for the World Wide Web
WE RECOVER A PREFERENTIAL ATTACHMENT MECHANISM FROM THE DATA.
DIFFERENT LANGUAGES PRODUCE SIMILAR STRUCTURES
WE FIND A SYSTEM SIMILAR TO THE WWW EVEN IF THE MICROSCOPIC RULE OF GROWTH IS VERY DIFFERENT.
•5.7 WIKIPEDIA
![Page 34: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/34.jpg)
The datasets of each language are available in two self-extracting files for a MySQL database. The table cur contains the current on-line articles, whereas the table old contains all previous versions of each current article. Old versions of an article are identified by having the same title, not the same id. The dataset dumps are updated almost weekly, so the current graph is usually not more than a week old.
To generate a graph from the link structure of a dataset, each article is considered a node and each hyperlink between articles is a link in this graph. In the wikipedia datasets, each webpage is a single article. An article might also contain some external links that point to pages outside the dataset. Usually wikipedia articles have no external links, or just a few of them. These links are not considered when generating the wikigraphs, since we want to restrict the graph to pages within the set being analyzed.
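The construction described above (articles as nodes, internal hyperlinks as directed edges, external links discarded) can be sketched as follows. The input format, a dictionary mapping each article title to the list of its link targets, and the function name `build_wikigraph` are hypothetical; they do not reflect the actual layout of the cur and old tables.

```python
def build_wikigraph(articles):
    """Build a directed graph from a {title: [link targets]} mapping.

    Only links whose target is itself an article of the dataset are kept,
    so external links are dropped.
    """
    nodes = set(articles)
    edges = set()
    for source, targets in articles.items():
        for target in targets:
            if target in nodes:  # skip links pointing outside the dataset
                edges.add((source, target))
    return nodes, edges


# Toy dataset with one external link (hypothetical titles and URL).
toy_articles = {
    "Graph": ["Network", "http://example.org/external"],
    "Network": ["Graph", "Internet"],
    "Internet": [],
}
nodes, edges = build_wikigraph(toy_articles)
```

Storing edges in a set collapses repeated links between the same pair of articles; a multigraph would need a list or a counter instead.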
•5.7 WIKIPEDIA
![Page 35: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/35.jpg)
We generated six wikigraphs, wikiEN, wikiDE, wikiFR, wikiES, wikiIT and wikiPT, from the English, German, French, Spanish, Italian and Portuguese datasets, respectively. The graphs were obtained from an old dump of June 13, 2004. We are not using the current data due to disk space restrictions: the English dataset of June 2005 is more than 36 GB compressed, that is about 200 GB expanded.
The most visited page was the main page for wikiEN, wikiDE, wikiFR and wikiES, while for the datasets wikiIT and wikiPT there were no visits associated with the pages.
•5.7 WIKIPEDIA
![Page 36: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/36.jpg)
• SCC (Strongly Connected Component) includes pages that are mutually reachable by traveling on the graph.
• IN component is the region from which one can reach the SCC.
• OUT component encompasses the pages reached from the SCC.
• TENDRILS are pages reachable from the IN component and not pointing to the SCC or the OUT region. TENDRILS also include those pages that point to the OUT region and do not belong to any of the other defined regions.
• TUBES connect the IN and OUT regions directly.
• DISCONNECTED regions are those isolated from the rest.
The Bow-tie structure, found in the WWW (Broder et al. Comp. Net. 33, 309, 2000)
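The main bow-tie regions can be extracted with two reachability searches around the largest SCC. The sketch below is illustrative only: the function names are made up, the SCC search is a brute-force one suitable for small graphs (not what one would use on a full wikigraph), and TENDRILS, TUBES and DISCONNECTED pages are lumped together in a single "other" set.

```python
from collections import defaultdict


def reachable(start_nodes, neighbours):
    """Return all nodes reachable from start_nodes following neighbours."""
    seen = set(start_nodes)
    stack = list(start_nodes)
    while stack:
        u = stack.pop()
        for v in neighbours[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def bow_tie(nodes, edges):
    """Split a directed graph into (SCC, IN, OUT, other)."""
    succ = defaultdict(set)  # forward links
    pred = defaultdict(set)  # backward links
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)
    # Largest strongly connected component: for each unassigned node,
    # intersect its forward and backward reachable sets.
    scc, assigned = set(), set()
    for u in nodes:
        if u in assigned:
            continue
        component = reachable([u], succ) & reachable([u], pred)
        assigned |= component
        if len(component) > len(scc):
            scc = component
    in_part = reachable(scc, pred) - scc   # pages that can reach the SCC
    out_part = reachable(scc, succ) - scc  # pages reached from the SCC
    other = set(nodes) - scc - in_part - out_part
    return scc, in_part, out_part, other


toy_nodes = ["a", "b", "c", "d", "e", "g"]
toy_edges = [("a", "b"), ("b", "c"), ("c", "b"), ("c", "d"), ("a", "e")]
scc, in_part, out_part, other = bow_tie(toy_nodes, toy_edges)
```

In the toy graph, b and c form the SCC, a is in IN, d is in OUT, e is a tendril hanging from IN, and g is disconnected.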
•5.7 WIKIPEDIA: Topology
![Page 37: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/37.jpg)
The percentage of the various components of the Wikigraph for the various languages.
The measure/size of the Wikigraph for the various languages.
•5.7 WIKIPEDIA: Topology
![Page 38: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/38.jpg)
In–degree (empty symbols) and out–degree (filled symbols) occurrence distributions for the Wikigraph in English (○) and Portuguese ().
The degree distribution shows fat tails that can be approximated by a power-law function of the kind $P(k) \sim k^{-\gamma}$, where the exponent $\gamma$ is the same for both in-degree and out-degree.
In the case of the WWW, $2 \le \gamma_{in} \le 2.1$
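One common way to estimate the exponent of a fat-tailed distribution is the maximum-likelihood estimator for a continuous power law, gamma = 1 + n / sum(ln(k_i / k_min)). The sketch below applies it to synthetic data; it is not the fitting procedure used for the figures on this slide, and both function names and the choice of k_min are assumptions. For integer degrees with small k_min the continuous estimator is only approximate.

```python
import math
import random


def powerlaw_exponent(values, k_min=1.0):
    """Continuous maximum-likelihood estimate of gamma in P(k) ~ k^-gamma.

    Only values >= k_min are used; at least one must exceed k_min.
    """
    tail = [k for k in values if k >= k_min]
    return 1.0 + len(tail) / sum(math.log(k / k_min) for k in tail)


def sample_powerlaw(gamma, n, k_min=1.0, seed=0):
    """Draw n synthetic values from a continuous power law (inverse transform)."""
    rng = random.Random(seed)
    return [k_min * (1.0 - rng.random()) ** (-1.0 / (gamma - 1.0)) for _ in range(n)]


samples = sample_powerlaw(gamma=2.1, n=20000)
estimate = powerlaw_exponent(samples)
```

With 20,000 synthetic samples the estimate should land close to the input value of 2.1; on real degree sequences the result depends strongly on the choice of k_min.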
•5.7 WIKIPEDIA
![Page 39: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/39.jpg)
The average neighbors’ in–degree, computed along incoming edges, as a function of the in–degree for the English (○) and Portuguese () Wikigraph.
As regards the assortativity (as measured by the average degree of the neighbours of a vertex with degree k), there is no evidence of any assortative behaviour.
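The quantity plotted here can be computed directly from an edge list. The following sketch assumes the graph is given as (source, target) pairs and that "computed along incoming edges" means averaging the in-degree of the pages that point to a given page; the function name `knn_in` is made up for illustration.

```python
from collections import defaultdict


def knn_in(edges):
    """Average in-degree of in-neighbours, as a function of in-degree.

    Returns {k: mean over vertices with in-degree k of the average
    in-degree of the vertices pointing to them}.
    """
    in_deg = defaultdict(int)
    preds = defaultdict(list)
    for u, v in edges:
        in_deg[v] += 1
        preds[v].append(u)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, sources in preds.items():
        k = in_deg[v]
        # Average in-degree of the vertices that link to v.
        avg = sum(in_deg.get(u, 0) for u in sources) / len(sources)
        sums[k] += avg
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


result = knn_in([("a", "c"), ("b", "c"), ("c", "d")])
```

A curve that stays flat in k, as reported on this slide, indicates no assortative or disassortative mixing; a growing curve would indicate assortativity.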
•5.7 WIKIPEDIA
![Page 40: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/40.jpg)
The pagerank distribution for wikiEN is a power-law function with γ = 2.1. Previous measurements on webgraphs exhibit the same behaviour for the pagerank distribution.
We list the number of visits of the top-ranked pages just to show that this value is not related to the pagerank values. We confirm that very little correlation was found between the link analysis characteristics and the actual number of visits.
•5.7 WIKIPEDIA
![Page 41: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/41.jpg)
Given the history of growth, one can verify the hypothesis of preferential attachment. This is done by means of the histogram $\Pi(k)$, which gives the number of vertices (whose degree is k) acquiring new connections at time t. This quantity is weighted by the factor N(t)/n(k,t).
We find preferential attachment for both in-degree and out-degree.
English (○) and Portuguese (). White = in-degree, filled = out-degree.
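A weighted histogram of this kind can be accumulated by replaying the edges in order of creation. The sketch below handles the in-degree case and rests on several assumptions: N(t) is taken to be the number of vertices present when the edge is added, n(k,t) the number of those vertices with in-degree k, a vertex is counted as present from the first edge that mentions it, and the function name is hypothetical.

```python
from collections import defaultdict


def attachment_histogram(timed_edges):
    """Weighted count of vertices of in-degree k that receive a new edge.

    timed_edges: (source, target) pairs sorted by creation time.
    Each event contributes N(t) / n(k, t) to the bin of the target's
    in-degree k at that time.
    """
    in_deg = {}              # current in-degree of every known vertex
    n_k = defaultdict(int)   # n(k, t): number of vertices with in-degree k
    hist = defaultdict(float)
    for u, v in timed_edges:
        for w in (u, v):
            if w not in in_deg:  # vertex appears for the first time
                in_deg[w] = 0
                n_k[0] += 1
        k = in_deg[v]
        hist[k] += len(in_deg) / n_k[k]  # weight N(t) / n(k, t)
        # Move v from the bin k to the bin k + 1.
        n_k[k] -= 1
        n_k[k + 1] += 1
        in_deg[v] = k + 1
    return dict(hist)


hist = attachment_histogram([("a", "b"), ("c", "b"), ("a", "c")])
```

If targets were chosen uniformly at random the weighted histogram would be roughly flat in k, whereas linear preferential attachment would make it grow proportionally to k.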
•5.7 WIKIPEDIA
![Page 42: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/42.jpg)
In our opinion the nature of this preferential attachment is effective rather than the real driving force in the phenomenon.
In other words, the linear preferential attachment can originate from a copying procedure (new vertices are introduced by copying old ones and keeping most of the edges). We could also have a sort of fitness for the various entries (but in this case one has a multidimensional series of quantities describing the importance of one page).
Apart from the interpretation, the data show a rather clear LINEAR PREFERENTIAL ATTACHMENT
•5.7 WIKIPEDIA
![Page 43: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/43.jpg)
Other power laws related to the dynamics need to be explained. For example, the number of updates also follows a power law.
Each point represents the number of nodes (y axis) that were updated exactly x times.
•5.7 WIKIPEDIA
![Page 44: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/44.jpg)
This feature is time invariant
•5.7 WIKIPEDIA
![Page 45: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/45.jpg)
Actually:
1) This network is oriented.
2) The preferential attachment in Wikipedia has a somewhat different nature. Here, most of the time, the edges are added between existing vertices, differently from the BA model. For instance, in the English version of Wikipedia a largely dominant fraction 0.883 of new edges is created between two existing pages, while a smaller fraction of edges points to or leaves a newly added vertex (0.026 and 0.091 respectively).
From these data it seems that a model in the spirit of BA could reproduce most of the features of the system.
•5.7 WIKIPEDIA
![Page 46: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/46.jpg)
We introduced an evolution rule, similar to other rewiring models already considered*:
• At each time step, a vertex is added to the network. It is connected to the existing vertices by M oriented edges; the direction of each edge is drawn at random:
• with probability R1, the edge leaves the new vertex and points to an existing one chosen with probability proportional to its in–degree;
• with probability R2, the edge points to the new vertex, and the source vertex is chosen with probability proportional to its out–degree;
• finally, with probability R3 = 1 − R1 − R2, the edge is added between existing vertices: the source vertex is chosen with probability proportional to the out–degree, while the destination vertex is chosen with probability proportional to the in–degree.
* See for example Krapivsky, Rodgers and Redner, PRL 86, 5401 (2001)
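The evolution rule can be simulated with two parallel lists of edge endpoints: picking a uniformly random existing edge and taking its target selects a vertex with probability proportional to its in-degree, and likewise its source for the out-degree. The sketch below is one possible reading of the rule, not the authors' code. The seed graph (two vertices linked in both directions), the function name, and the decision to draw only from edges that existed at the start of the time step are assumptions; self-loops and repeated edges are not filtered out.

```python
import random


def simulate_wikigraph(steps, m, r1, r2, seed=0):
    """Grow a directed graph with the R1 / R2 / R3 rule.

    Returns (number of vertices, list of (source, target) edges).
    """
    rng = random.Random(seed)
    # Assumed seed graph: vertices 0 and 1 pointing at each other.
    sources = [0, 1]
    targets = [1, 0]
    n_vertices = 2
    for _ in range(steps):
        new = n_vertices
        n_vertices += 1
        n_old = len(sources)  # edges existing before this time step
        for _ in range(m):
            p = rng.random()
            if p < r1:
                # New vertex points to an old one, chosen ~ in-degree.
                s, t = new, targets[rng.randrange(n_old)]
            elif p < r1 + r2:
                # Old vertex, chosen ~ out-degree, points to the new one.
                s, t = sources[rng.randrange(n_old)], new
            else:
                # Edge between old vertices: source ~ out-degree,
                # destination ~ in-degree.
                s = sources[rng.randrange(n_old)]
                t = targets[rng.randrange(n_old)]
            sources.append(s)
            targets.append(t)
    return n_vertices, list(zip(sources, targets))


n, edges = simulate_wikigraph(steps=1000, m=10, r1=0.026, r2=0.091)
```

The resulting edge list can be used to build in-degree and out-degree histograms and compare their tails with the empirical distributions.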
•5.7 WIKIPEDIA: Modelling
![Page 47: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/47.jpg)
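The model can be solved analytically:
$$P(k_{in}) \sim k_{in}^{-\gamma_{in}}, \qquad \gamma_{in} = 1 + \frac{1}{1-R_2}$$
$$P(k_{out}) \sim k_{out}^{-\gamma_{out}}, \qquad \gamma_{out} = 1 + \frac{1}{1-R_1}$$
For the model we can use the empirical values R1 = 0.026, R2 = 0.091, R3 = 0.883 already measured for the English version of the Wikigraph; these fix the values of $\gamma_{in}$ and $\gamma_{out}$.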
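As a worked check, the exponents can be evaluated for the empirical probabilities. The sketch assumes the analytical result reads gamma_in = 1 + 1/(1 - R2) and gamma_out = 1 + 1/(1 - R1). This is the form obtained from a rate-equation argument in which the in-degree of an existing vertex grows as dk/dt = (1 - R2) k / t, and the out-degree as dk/dt = (1 - R1) k / t. The function name is hypothetical.

```python
def degree_exponents(r1, r2):
    """Exponents of the in- and out-degree distributions of the model.

    Assumes gamma_in = 1 + 1/(1 - R2) and gamma_out = 1 + 1/(1 - R1).
    """
    gamma_in = 1.0 + 1.0 / (1.0 - r2)
    gamma_out = 1.0 + 1.0 / (1.0 - r1)
    return gamma_in, gamma_out


gamma_in, gamma_out = degree_exponents(r1=0.026, r2=0.091)
print(round(gamma_in, 3), round(gamma_out, 3))  # prints 2.1 2.027
```

Under these assumptions both exponents fall just above 2 for the English Wikigraph values.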
•5.7 WIKIPEDIA: Modelling
![Page 48: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/48.jpg)
The model can be solved analytically:
$$K_{nn}^{in}(k_{in}) \sim M N^{1-R_1} \frac{R_1 R_2}{R_3} \quad (R_3 \neq 0)$$
$$K_{nn}^{in}(k_{in}) \sim M R_1 R_2 \ln(N) \quad (R_3 = 0)$$
In both cases $K_{nn}^{in}$ is constant, i.e. independent of $k_{in}$.
The value of the constant also depends on the initial conditions. The two lines refer to two realizations of the model; in one of them the first 0.5% of the vertices have been removed.
•5.7 WIKIPEDIA: Modelling
![Page 49: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/49.jpg)
• We have a structure that resembles the bow-tie of the WWW
• We have a power-law decay for the degree distributions and also a power-law decay for the number of updates per page
• Preferential Attachment in the Rewiring seems to be the driving force in the evolution of the system
• The microscopic structure of rewiring is very different from that of the WWW. In principle a user can change any series of edges and add as many pages as wanted. Still, most of the quantities are similar
•5.8 WIKIPEDIA: CONCLUSIONS
![Page 50: DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks](https://reader035.vdocuments.us/reader035/viewer/2022062517/5681377a550346895d9f12ea/html5/thumbnails/50.jpg)
The finding that the pagerank of the pages is not related to the number of visits opens a very interesting scenario for further research work. Since, by definition, pagerank should give us the visit time of the page, and since it is actually completely independent of the number of visits, we wonder whether pagerank is a good measure of the authoritativeness of the pages in wikigraphs and which modifications should be introduced in order to tune its performance.
•5.8 WIKIPEDIA: CONCLUSION