gardens of control - journal of network theory
TRANSCRIPT
-
8/14/2019 Gardens of Control - Journal of Network Theory
1/41
1
Here Goes Nothing | Michiel Stoter
Editorial | Tom van de Wetering
Protocols and Power Laws | Coen de Goey
Utilizing the Rules | Jeroen Knitel
The URI Revis(it)ed | Tom van de Wetering
SPECIAL
50TH EDITION
JOURNAL OF NETWORK THEORY
GARDENS OF CONTROL
Contents
Editorial - The Seed and the Environment .............................................................................................. 3
Protocols and power laws ....................................................................................................................... 6
Abstract ............................................................................................................................................... 6
Introduction ......................................................................................................................................... 6
Distributed and decentralized protocol .............................................................................................. 7
User participation and power structures .......................................................................................... 11
Sources .............................................................................................................................................. 12
Utilizing the rules: The rule-based paradigm as a complement to connectionism ............................... 14
1 Introduction .................................................................................................................................... 14
2 Creating or developing rules .......................................................................................................... 17
3 Hybrid models for the win .............................................................................................................. 19
Bibliography ....................................................................................................................................... 20
The Uniform Resource Identifier Revis(it)ed ......................................................................................... 22
Dream Machines Defined v1.0 .......................................................................................................... 22
Web Research Defined v1.1 .............................................................................................................. 23
Linked Data Defined v2.0 .................................................................................................................. 24
Software Actors Defined v2.1 ............................................................................................................ 24
Semantic Web Defined v2.2 .............................................................................................................. 25
URL to URI 3.0 .................................................................................................................................... 26
Bibliography ....................................................................................................................................... 27
Here goes nothing how control vanished into everywhere. .............................................................. 29
1. Introduction ................................................................................................................................... 29
2. Metaphors ..................................................................................................................... 30
3. The loss of aura and rise of the hyperreal ..................................................................................... 31
4. Metaphors as tools for control ...................................................................................................... 32
6. Conclusion ..................................................................................................................................... 34
REFERENCES: ..................................................................................................................................... 34
Media Ownership by Gillian Doyle ........................................................................................................ 36
Review of Six Degrees: the Science of a Connected Age by Duncan J. Watts (2003) 356 pp. .............. 38
Book review: Jean Baudrillard Simulacra and Simulation (1994) ....................................................... 40
Editorial - The Seed and the Environment
by Tom van de Wetering
Tim Berners-Lee, inventor of the World Wide Web, was clear in a recent TED keynote. He shouted
"Raw Data Now!" and left his audience with an interesting thought: the future will bring a web
consisting of linked data instead of linked documents.
Simply put, this keynote was the latest in a long tradition of predictions and dreams about file
structures such as the Web. Back in 1945, Vannevar Bush tried to conceptualize something like a
dream machine, which was further elaborated in Ted Nelson's dream file (1964). Not long after its
invention in 1990, the Web was seen by many as the realization of those earlier dreams, but a next
step, sometimes called the Semantic Web and sometimes Web 2.0, evoked dreams and predictions
like a volcanic eruption. So why is just this keynote important, and not one of the many others?
First, it is because Sir Tim Berners-Lee expressed his dreams. Although many other factors were
involved in the emergence of the Web, it was Berners-Lee who invented the crucial HTML, HTTP and
URI protocols. As director of the World Wide Web Consortium (W3C), he remains responsible for the
development of original and new Web protocols, like XML. Galloway (2004) already showed that
protocols in a way control the structure of contemporary networks. In that sense, Berners-Lee is a
powerful actor in the construction of a control machine. This special issue of Network Theory deals
with the problem of control: not on the frequently elaborated level of political control, social control
or capital control, but on the level of network control. How is the structure of the Web controlled,
which actors are involved, how are structures changed and, last but not least, how do we need to
study it?
In the first article, Coen de Goey claims that common perceptions of control need to be re-evaluated.
Distributed network structures like the Web are still controlled by a small set of institutions. The
structure of Web 2.0 is thought to put the user in control, but instead, De Goey argues, the users
collectively created an instrument that handed control over them to others.
The second contribution also flows from Berners-Lee's keynote. To deal with the complex system
the inventor is arguing about and to translate technical details for a broad audience, Berners-Lee
uses a lot of metaphors. The important thing is that some of them seem quite new, like "raw data".
Michiel Stoter claims that metaphors are, and have always been, important actors involved in the
structure of the Web. Metaphors are ubiquitously present in the debate about Web 2.0 and the
Semantic Web; and are we aware that the terms "web" and "link" are metaphors too?
Third, the concept of linked data Berners-Lee proposes indicates an interesting claim about
structural Web changes. In the third article, I will show how seemingly small changes to Web
standards, which are often misinterpreted, facilitate practices that are known to change the Web
and far beyond, like Web 2.0. I will introduce the field of Software Studies, which can help Network
Science and other fields to better understand and define the Web objects we all encounter.
In the fourth article, Jeroen Knitel will further elaborate that methodological problem. He states
that the connectionist approach to studying complex systems has delivered interesting results, but needs to
integrate some fuzzy logic if we really want to understand what those results mean. Knitel argues
that tiny rule-based systems are collaboratively in control of generated network effects. A
methodology consisting of both connectionist and rule-based methods is a welcome addition.
The seed and the environment
The articles in this special issue are sometimes closely related to studying networks. Knitel is directly
concerned with the methodology of network scientists and De Goey studies how control exists
through distributed networks. The other two articles are less easily linked to Network Theory. While
Stoter's article pays a lot of attention to computer icons, my own study contains many descriptions of
how software works. Why does Network Theory need such alternatives to the social, technological
and socio-technological models and maps we are used to? An example from the field of biology can
help us answer that question.
"The crystal is expression. Expression moves from the mirror to the seed. It is the same circuit which
passes through three figures, the actual and the virtual, the limpid and the opaque, the seed and the
environment." (Gilles Deleuze)
Gilles Deleuze is a philosopher, not a biologist. Yet he often supports his claims with biological
entities, like the seed in this case. He uses biological entities as metaphors to sustain his claims about
something else. The same is true for the rhizome. As a metaphor it inspired many contemporary
philosophers and artists. We can compare many complex systems with a rhizome. In the eyes of
Deleuzians, many things, for example the economy, art and even "reality", are based on a shared set
of principles, for example: "any point of a rhizome can be connected to anything other, and must be"
(1980: 7) and "a rhizome may be broken [...] but it will start up again on one of the old lines, or on
new lines" (p. 10). Network Theory scholars often relate the networks they encounter to the concept
of the rhizome. Often, the original Web is called a rhizome too, based on those key principles. Just like
a biological structure, a Web of linked documents is easy to conceptualize and to match with such a
metaphor. Indeed, there is no beginning and no end, and indeed, cutting off a website does not
destroy the whole Web. Deleuze's metaphorical concepts are compatible with a broad field of topics. In
this sense, philosophical metaphors are the standardized protocol on top of which studies are
developed as an extra layer. What is needed is to adapt such a philosophical metaphor and start
analyzing the actual object. What exactly is the "seed" in the case of the Web? What is the
"environment"? How does the seed interact with the environment?
The problem is that it becomes a lot more difficult when a definition of real objects, in this case real
rhizomes, is necessary. Biological structures keep growing and change into many different species.
After a long period of evolution, a lot of brilliant analytical hindsight is needed to arrive at an
"origin of species". While Darwin's theory offers this brilliant hindsight, contemporary research
confirms that evolutionary theory is still a theory, which can be debated and further improved.
Simon Conway Morris (2003) demonstrated that the evolution of species is to a great extent
influenced by the environment species live in. Using an exhaustive resource of real biological
examples, the biologist claims that different species, with very different DNA, but living in the same
environment, are convergent: they begin to look like each other without being genetically related.
In that example, the "environment" is taken as a fact, the "seed" is defined as a real biological
organism, and the result is a better understanding of the structure of evolution.
Now look at the image on the cover of this issue. What we see is the fourth element of Berners-Lee's
keynote: a graphic to illustrate the concept of linked data. The graphic can tell a lot. It relates the
concept of linked data to the concept of the rhizome. It makes clear that underlying data structures
are for websites what rhizomes are for beautiful flowers. But it also tells us there is an
environment: that websites look different in a different environment, just like flowers do.
To take the comparison to the next level: De Goey investigates who controls the garden and how some
flowers control the behavior of other flowers. Stoter agrees with Conway Morris when he states that
flowers are visually convergent without mixed DNA. I claim that not only the flowers but also the
rhizomes are influenced by their environment. Knitel claims that it is as necessary to analyze the
number of flowers, organized by species, as it is to understand how seeds grow.
As Berners-Lee et al. describe in detail, the main point of interest for the construction of linked data
will be "the adoption of common conceptualizations referred to as ontologies" (2006: 96). Without
this "semantic standardization", linkage between databases organized in various formats is difficult
or returns confusing results. Interestingly, being both Web scholars and scholars as such, all writers in
this issue reflect both on the development of Web ontologies and on the development of scientific
ontologies. What is needed for the field of Web studies and other fields related to the Web is a
common understanding of the Web's ontology, which is compatible as an assumption in many fields,
and which needs, like every other ontology, to be continuously evaluated. At the same time,
designing a better way to document and share that ontology, in the form of standardized linked
(research) data, could lead to the results Berners-Lee promised us. Meanwhile, attempts in the form of
well-written scientific publications are not rejected, continuing the inefficiently beautiful tradition of
the Humanities. I am sure this form, the 50th issue of the Journal of Network Theory, is a valuable
attempt to both evaluate and design both Web and research ontologies.
Bibliography
Berners-Lee, T. (2009). Linked Data. TED Conference 2009.
Conway Morris, S. (2003). Life's Solution. Cambridge: Cambridge University Press.
Deleuze, G. & Guattari, F. (1980). A Thousand Plateaus. London: Continuum.
Galloway, A. R. (2004). Protocol: How Control Exists after Decentralization. Cambridge: MIT Press.
Protocols and power laws
by Coen de Goey
Abstract
In this article, I discuss how protocols and power structures can serve as a bottleneck for the
distributed character of the Web. Firstly, the DNS structure is decentralized but also managed by an
independently chosen and diverse staff. This structure functions as a bottleneck if only the physical
structure is considered; its composition, however, consists of independent parties, which makes it
distributed. Secondly, the distributed TCP/IP structure is, despite its structure of autonomous nodes,
subject to political and commercial interests; the restriction of Internet access in China is an
example that underlines this.
The Achilles heel of the Internet is not a single, clearly distinguishable aspect. The model that shows
the structure best might be found in the scale-free network. The power law in this model was most
obviously visible in the early structure of Web 1.0, when it was about product sales. Web 2.0 is still
scale-free, but this time it is not about products but about platforms that provide services with a
more active role for the user. Whether the Web has become better distributed through Web 2.0 can
be answered by considering which format provides the nodes in the Web with the most autonomy.
As long as protocols, political limitations or platform standards manipulate the behavior of the user,
they form a threat to the distribution and accessibility of data streams. However, since Web 2.0 has
proposed to step back in order to let go of control, the user has come a step closer to being an
autonomous node in the Web.
Introduction
"Welcome to the democratic web: internet of the people, by the people, for the people" (Stibbe
2006). This is a sentence derived from a blog that mirrors the overall tendency around Web 2.0. The
network of the Internet - and especially the current developments around Web 2.0 - is seen by many
as the key to a democratic system that would provide open access and a minimum of interference
from unwanted parties. This vision is legitimated, among other things, by the notion of the Internet
as a non-governed system, "rhizomatic and lacking central organization" (Galloway 2004, 8). Castells
also mentions the sentiment from the start of the Internet: "Created as a medium for freedom, in the
first years of its worldwide existence the Internet seemed to foreshadow a new age of liberty"
(Castells 2001, 168). The original development of the Internet began with the scientific ARPAnet and
was oriented towards packet switching over a distributed network. Packet switching was used in
order to secure the transfer if one part would drop out (Leiner 2003). This was made possible by
using other computers as hosts, serving as network nodes. Licklider, head of the computer research
program at the initiating organization DARPA, was the first to introduce this concept as the "Galactic
Network" (ibidem). His idea was "a globally interconnected set of computers through which everyone
could quickly access data and programs from any site" (ibidem, Origins of the Internet). This implies a
network without the classical power structures such as purely central or decentralized structures
(Galloway 2004, 27). While the distributed network freed itself from this power structure, it did not
become free from control, as Galloway points out. The new control mechanism lies in the protocol
that is used in every form of online data transfer: "In order to initiate communication, the two nodes
must speak the same language. This is why protocol is important" (ibidem, 12).
This overall protocol is visible in both the distributed TCP/IP protocol and the decentralized DNS
structure. The TCP/IP protocol was designed to create a standardized language for data transfer. It is
called distributed because the individual hosts function as autonomous nodes based on this protocol.
The DNS system was adopted in order to simplify the process of referring to an address. The DNS
structure represents a hierarchical order because the listing of addresses takes place in a single
location (ibidem, 8-9). The combination of DNS and TCP/IP creates a new form of control that is
hence ambivalent in nature but also complementary, because both are protocols within the same
system.
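The hierarchical lookup that makes DNS a point of control can be sketched as a toy resolver. The class, names and address below are illustrative assumptions for this sketch, not part of the DNS specification; the structural point is that every query descends from a single root.

```python
# A toy model of hierarchical DNS resolution: every lookup starts at the
# root and descends the inverted tree one name label at a time, mirroring
# the path from root servers down to lower servers. Illustrative only.

class DnsNode:
    def __init__(self):
        self.children = {}   # label -> DnsNode, i.e. delegations downwards
        self.address = None  # the record stored at a leaf, if any

    def insert(self, name, address):
        """Register `name` (e.g. 'www.example.org') under this root."""
        node = self
        for label in reversed(name.split(".")):   # org -> example -> www
            node = node.children.setdefault(label, DnsNode())
        node.address = address

    def resolve(self, name):
        """Descend from the root; every query passes through it first."""
        node = self
        for label in reversed(name.split(".")):
            node = node.children.get(label)
            if node is None:        # no server knows this name
                return None
        return node.address

root = DnsNode()
root.insert("www.example.org", "93.184.216.34")
print(root.resolve("www.example.org"))
```

However many sub-servers the tree contains, each resolution funnels through the single root object, which is exactly the bottleneck in the physical structure the article identifies.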
I will discuss these forms of control and how they are established by politics and legislation. In order
to do this, I will also focus on the function of commercial interests, which is in a way interrelated with
protocological power. I will draw this parallel using Barabási's scale-free network (2003). Barabási
introduces the scale-free network as a network in which a minority of nodes have far more links
than average (Barabási 2003, 52). The major growth of some of these hubs brings in different
interests and, more importantly, a disproportionate power share. This is why Barabási calls the scale-
free network "robust against accidental failures but vulnerable to coordinated attacks" (ibidem, 52). I
will propose a similarity with the hubs that are visible in the form of big Internet companies such as
Google. Many of the services of these companies are known as Web 2.0, meaning in short that the
user becomes central in the process of content supply while the website supplies the platform for
these activities (O'Reilly 2005, 2). This model could also be seen as a distributed system, for a
multiplicity of users instead of a single producer is responsible for the provision of content. What
is more, powerful Web 2.0 services could not even exist without the small start initiated by a few
users. This brings me to a new paradox: Web 2.0 exists by the very participation of its users and has
now taken control over these users through the central position it has gained. To name a few: Google,
MySpace, Flickr and Wikipedia. These services are comparable to the hubs Barabási mentions in his
model of the scale-free network. I will argue how these companies become powerful through the
apparent release of control over content, without necessarily improving the distributed and plural
scope of the Internet.
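The growth mechanism behind Barabási's hubs can be simulated in a few lines. The sketch below is a minimal preferential-attachment model in the spirit of the scale-free network; the network size and random seed are arbitrary assumptions made for illustration.

```python
# A toy preferential-attachment model: each newcomer links to an existing
# node with probability proportional to that node's current degree, so a
# few early nodes grow into hubs with far more links than average.
import random

def grow_network(n_nodes, seed=42):
    degree = {0: 1, 1: 1}     # start from two linked nodes
    pool = [0, 1]             # node i appears in `pool` degree[i] times
    rng = random.Random(seed)
    for new_node in range(2, n_nodes):
        target = rng.choice(pool)      # degree-proportional choice
        degree[new_node] = 1
        degree[target] += 1
        pool += [new_node, target]
    return degree

degree = grow_network(2000)
top5 = sorted(degree.values(), reverse=True)[:5]
average = sum(degree.values()) / len(degree)
print(top5, average)   # a handful of hubs, while the average degree stays near 2
```

The output illustrates the power law in miniature: a small minority of nodes accumulates a disproportionate share of the links, just as a few platforms accumulate a disproportionate share of users and data.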
Distributed and decentralized protocol
At the origin of the Internet there was a need for a general form of protocol in order to communicate
and develop according to the same standards. This began with the data transmission protocol TCP/IP
in 1980, built for the purpose of "cheap ubiquitous connectivity" (Galloway 2004, 6). Protocols also
refer specifically to "standards governing the implementation of specific technologies" (ibidem, 7).
Different institutions are responsible for the development of protocols, such as the W3C1, which is
responsible for protocols like HTML and CSS. The standards are published in RFC documents and
fulfill the role of reference for developers, so that different nodes in the network function according
to the same rules: "the requirements spelled out in this document are designed for a full-function
Internet host, capable of full interoperation over an arbitrary Internet path" (Braden 1989, 1.1.3).
The protocols are divided into four layers that each organize one part of the protocol: the application
layer, which is the top layer, the transport layer, the Internet layer and finally the link layer (ibidem).
These layers have a nested structure, meaning that each layer's data is carried inside the layer
beneath it: "ultimately the entire bundle (the primary data object encapsulated within each
successive protocol) is transported according to the rules of the only privileged protocol, that of the
physical media itself" (Galloway
1 World Wide Web Consortium
2004, 11). Despite the arbitrary organization of transmission standards, control within the TCP/IP
system is distributed over autonomous nodes in the network. There is no need for a central hub to
direct the communication; therefore it is a distributed network.
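Galloway's nesting of layers can be made concrete with a small sketch. The dictionary representation below is an illustrative assumption (real protocol stacks use binary headers); the layer names follow the four-layer model described above.

```python
# A sketch of nested encapsulation in the four-layer model: each layer
# wraps the data object of the layer above it, and the receiving host
# unwraps them in reverse order. Illustrative representation only.

def encapsulate(app_message):
    segment = {"layer": "transport", "data": app_message}  # e.g. TCP
    packet  = {"layer": "internet",  "data": segment}      # e.g. IP
    frame   = {"layer": "link",      "data": packet}       # e.g. Ethernet
    return frame   # the "entire bundle" handed to the physical medium

def decapsulate(frame):
    # unwrap link -> internet -> transport to recover the application data
    return frame["data"]["data"]["data"]

msg = {"layer": "application", "data": "GET /index.html"}  # e.g. HTTP
assert decapsulate(encapsulate(msg)) == msg
```

The sketch shows why Galloway can call the physical medium the "only privileged protocol": whatever any upper layer contains, it travels only as the innermost payload of the bottom layer's bundle.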
A different but complementary form of protocol lies in the DNS, "a large decentralized database that
maps network addresses to network names" (ibidem, 8-9). This system works like an inverted tree
structure and is hierarchical and decentralized in nature. The top level is formed by the root servers,
containing the addresses of the underlying servers. This is a decentralized network, because every
request goes via "one of the dozen root servers" (ibidem, 9) to the lower servers with the addresses
of specific domains. In the past there has been discussion over the place this DNS system should have
and how it should be governed. The DNS was first managed by one of the founders, Jon Postel, but
after his death the DNS management was handed over from the US government to the non-profit
organization ICANN2 in 1998 (Terranova 2004, 45). The discussion concerned whether and how
specific US or Western interests should be the measure for the formation of Internet policy. This is
demonstrated by the following quote, derived from the Green Paper, a proposal document for the
democratization of DNS governance that was published by the U.S. government:
"the U.S. government recognizes that its unique role in the Internet domain name system
should end as soon as is practical. We also recognize an obligation to end this involvement in
a responsible manner that preserves the stability of the Internet. We cannot cede authority
to any particular commercial interest or any specific coalition of interest groups. We also
have a responsibility to oppose any efforts to fragment the Internet, as this would destroy
one of the key factors - interoperability - that has made the Internet so successful" (NTIA
1998, Registrars, The Process).
ICANN was founded after this proposal as a response to the call for a more democratic and robust
structure of the Internet. It is composed of a collection of staff members and is connected to diverse
advisory organs (see Figure 1 for a complete overview). However, ICANN has still been the subject of
discussion, among other things about the role of the U.S. as the leading state in this organization.
One side says the U.S. should stay in charge to maintain the power and right of free speech on the
Internet. "But others say the U.S. and its insistence on English use and its preference for a strictly
Western way of looking at the world unfairly imposes its cultural values on others" (Nolan, in:
Eweek.com 2005-11-17).
Another aspect of concern has been the intertwining of interests and ICANN's power over the
economic market. Recently, questions have been posed by the European Parliament about a possible
infringement by ICANN of the free trade market. This followed an agreement between ICANN and
the name registry organ VeriSign about the rights on domain name sales:
"Trade in registration services for generic domain names (e.g. those ending in .com, .net or
.org) is controlled by an incorporated private industry body in the United States (ICANN)
which also operates an office in Belgium.
ICANN sets minimum wholesale prices for domain name registration and awards the right to
run Internet generic domain registries which offer services within the Single Market as well
2 Internet Corporation for Assigned Names and Numbers
as globally. ICANN collects a levy on every generic domain name that is registered worldwide,
including all such registrations for companies, organisations and consumers throughout the
Member States and the European Economic Area.
ICANN has recently entered into a number of arrangements with other undertakings,
including one with the largest domain name registry company (Verisign), to which ICANN has
granted the exclusive right to the .com and .net name registries in return for a levy on the
end consumer, which bears no relation to the cost of providing the service, and which
levy has resulted in price hikes for European consumers.
Has the Commission received any complaints from European citizens or businesses, and, in
any event, will the Commission investigate whether the arrangements between ICANN,
Verisign, and European domain name registrar companies are subject to Art. 81 and/or Art.
82 of the Treaties?" (Dunn 2007, Question 78).
Although ICANN claims its role is not concerned with content - "ICANN's role is very limited, and it is
not responsible for many issues associated with the Internet, such as financial transactions, Internet
content control, spam (unsolicited commercial email), Internet gambling, or data protection and
privacy" (ICANN 2008, FAQs) - it has a lot of control because it has a central position in the definition
and management of domains. To prevent the rise of power structures within ICANN, a diverse staff
was formed of people who rotate after given periods and are nominated by a separate committee, as
is visible in Figure 1. An advisory committee from the U.N. was also formed after a summit called the
World Summit on the Information Society, addressing among other things the bridging of the digital
divide: "We reaffirm the commitments made in Geneva and build on them in Tunis by focusing on
financial mechanisms for bridging the digital divide, on Internet governance and related issues, as
well as on implementation and follow-up of the Geneva and Tunis decisions" (WSIS 2005, Tunis
agenda for the information society). Ironic is the fact that in the prior debate, the U.S. wanted to
keep control over ICANN for the sake of freedom and democracy - the same reason the EU had for
forming an international control organ (van der Wal 2005, "EC: Amerikaans DNS-beheer kan leiden
tot breuk internet").
Figure 1: Organizational structure of ICANN (source: www.icann.org)
An example of unwanted regulation of access to Internet resources can be found in China. Because of
the strict control of the media, it is not possible to access the same material within the borders of
China as elsewhere. This regulation is exercised by "the world's most sophisticated network for
monitoring and limiting information online" (McMahon 2008, U.S. Internet Providers and the Great
Firewall of China). Problematic in this case is the relation between free market processes and an
open Internet climate. According to the same Council on Foreign Relations, several U.S. companies
are providing China with services and materials that allow the Chinese authorities to monitor and
restrict free access and publication on the Internet: "China relied on two U.S. companies - Cisco
Systems and Juniper Networks - to help carry out its network upgrade, known as CN2, in 2004. This
upgrade significantly increased China's ability to monitor Internet usage. Cisco is also due to provide
China with routers designed to handle Internet attacks by viruses and worms but equally capable of
blocking sensitive content" (ibidem). Yahoo and Google are also mentioned in the same article as
parties that helped the authorities in their attempts to censor online publications.
The debate, according to McMahon, is between the idea of establishing a presence in the country
and a total boycott of business with China by means of a law that forbids U.S. Internet companies
"from locating their content servers inside China or other nations seen as human rights abusers"
(ibidem). This case is exemplary of the way state power can interfere with the open character of
the distributed Web. State censorship is aided by any unevenness in the distribution of the
network caused by power structures within the network. Interesting in this case is the comparison
with the scale-free network (Barabási 2003). The scale-free network is, as mentioned in the
introduction, considered to be remarkably "resistant to accidental failures but extremely vulnerable
to coordinated attacks" (ibidem, 52). When important hubs in this network, like Google or Yahoo,
decide to cut access to their resources, it immediately affects the whole network.
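Barabási's "robust yet vulnerable" point can be illustrated with a rough simulation. The growth rule, network size and seed below are illustrative assumptions: removing a few random nodes from a hub-dominated network costs few links, while removing the few biggest hubs severs many at once.

```python
# A rough sketch of scale-free robustness: compare how many links survive
# the removal of 5 random nodes versus the removal of the 5 biggest hubs.
import random

random.seed(7)
edges = [(0, 1)]
for new_node in range(2, 500):
    # preferential attachment: nodes appear in `pool` once per link they have
    pool = [n for edge in edges for n in edge]
    edges.append((new_node, random.choice(pool)))

def surviving_edges(edges, removed):
    """Edges left after deleting every node in `removed`."""
    return [e for e in edges if e[0] not in removed and e[1] not in removed]

degree = {}
for a, b in edges:
    degree[a] = degree.get(a, 0) + 1
    degree[b] = degree.get(b, 0) + 1

hubs = set(sorted(degree, key=degree.get, reverse=True)[:5])   # top 5 hubs
randoms = set(random.sample(sorted(degree), 5))                # 5 random nodes

print(len(surviving_edges(edges, randoms)))  # random failure: most links survive
print(len(surviving_edges(edges, hubs)))     # hub attack: far fewer survive
```

The asymmetry between the two counts is the network-level analogue of the claim above: losing arbitrary websites barely dents the Web, but a hub like Google or Yahoo withdrawing its resources affects the whole network.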
User participation and power structures
Previously I discussed how the Internet has a new form of control that is managed by protocols. The
TCP/IP protocol is the distributed form of control and the DNS protocol functions as a decentralized
network. As I showed, the decentralized structure of the DNS is not simply an autonomous power
that imposes policy on the user. The distributed structure of TCP/IP is also more than just an open-
access network free of power structures. In this chapter I want to address the role of companies in
the structure of Web 2.0.
Web 2.0 is often misunderstood as being solely a platform; according to Tim O'Reilly, this is also the
case with services that do not fit the profile of Web 2.0. An example O'Reilly mentions is that of
Netscape, which offered an application that could be installed in order to have access to content and
applications within the browser (O'Reilly 2005, Ch. 2). This platform was created in order "to use their
dominance in the browser market to establish a market for high-priced server products" (ibidem).
The Web 1.0 era was, so to say, all about selling products and creating a leading market position in
software licensing and "control over APIs"3 (ibidem, Ch. 1). This principle differs from Web 2.0
because it offers a product instead of a service. What typifies Web 2.0 is that it provides a platform
of services that becomes more valuable the more users it has: "Web 2.0 companies set inclusive
defaults for aggregating user data and building value as a side-effect of ordinary use of the
application. [...] they build systems that get better the more people use them" (ibidem, Ch. 2). This
system indeed creates a power structure, but this time the power lies in the gathering of data. The
competition has moved from sales towards data collection (ibidem).
The profit in this model is, according to O'Reilly, made by selling services, with customers paying,
directly or indirectly, for the use of those services (ibidem, Ch. 1). Later in the article O'Reilly
characterizes successful Web 2.0 companies as companies that "give up something expensive but
considered critical to get something valuable for free that was once expensive. For example,
Wikipedia gives up central editorial control in return for speed and breadth" (ibidem, Ch. 5). This
release of control is, in a way, simply replaced by another means of control, one reaching into even
more aspects of the user's life. Because of the adaptation to the specific wishes of every user,
classic boundaries between areas like home and work are blurring. Sociologist and policy analyst
William Davies writes in his column for the website The Register how this process does more than
merely provide users with the services they need:
In short, efficiency gains are no longer being sought only in economic realms such as retail
or public services, but are now being pursued in parts of our everyday lives where previously
they hadn't even been imagined. Web 2.0 promises to offer us ways of improving the
processes by which we find new music, new friends, or new civic causes. The hassle of
undesirable content or people is easier to cut out. We have become consumers of our own
social and cultural lives. [...] Web 2.0 takes the efficiency-enhancing capabilities of digital
technology and pushes them into areas of society previously untouched by efficiency
criteria (Davies 2007, 31 July).
The collection of user data mentioned earlier and the penetration of the user's daily life can be
seen as the downside of Web 2.0. However, it is also interesting to look at what Web 2.0 did to
3 Application Programming Interface: a set of protocols defined for building applications.
contribute to the distribution of the Net. Web 1.0 was, in its structure, already considered
relatively robust because each entity was considered autonomous (Galloway 2004, 8). Web 2.0 goes
even further by making the user a producer of data instead of a consumer (O'Reilly 2005, Ch. 1).
This increases the number of nodes in the network, which makes a better distribution of data streams
possible.
According to Barabási, the Internet is in structure a scale-free network with a few dominating hubs
like Google and Yahoo (Barabási 2003, 53). This power law is present in Web 1.0 as well as Web 2.0;
the big corporations act as hubs. Since some central organization is necessary for the network to
function properly, power structures could be seen as a necessary evil. The power law stands, in
fact, for any central form of organization. In the case of the Internet, the big hubs facilitate a
standard and create possibilities that wouldn't exist without the central position of the hub:
"what makes the web an efficient medium for information exchange: it is a network
containing centralised nodes where they are necessary, and not where they aren't. Google is
one of the centralised nodes. It works well because it is a monopoly, not in spite of being
one" (Davies 2007, 29 March).
As pointed out before, Web 2.0 crosses associative boundaries, creating a new, pervasive way of
linking between nodes in the network. Besides that, users have become producers instead of
consumers. To return to the central issue: what does this mean for the distribution of the network?
As mentioned, the companies behind Web 2.0 have become less visible. That is to say, it is less
directly about selling and more about providing a platform for data collection. This collection of
data is the price for a free platform of services.
This issue of media ownership is also addressed by film and media scholar Gillian Doyle. She
mentions the ambivalence in the concentration of media ownership: "the fact that expansion gives
rise to efficiency gains provides a compelling public interest case in favour of media ownership
policies which encourage rather than curb such growth strategies" (Doyle 2002, 37-38). Apart from
this matter lies the question of how media ownership in a situation of distinct producers and
consumers differs from the Web 2.0 scenario, in which this boundary is blurred. The outcome of the
Web 2.0 development is not clear. What is clear is that the individual user has become central in
the production of data and the offering of different products. Controlling this platform is
synonymous with cooperation and integration with other services (O'Reilly 2005, Ch. 7). As long as
this focus on the specific needs of the user stays central, for whatever intentions it may serve,
the network has an important instrument for achieving the goal of independent access to data. Last
but not least, the user has become an active and conscious node in the Web, one that will be aware
of its own value when hubs fall apart.
Sources
Barabási, A. and Bonabeau, E. 'Scale-Free Networks'. Scientific American 288, 2003
Braden, R. RFC 1122, Requirements for Internet Hosts, Communication Layers, October 1989
http://tools.ietf.org/html/rfc1122 (visited 3-4-2009)
Davies, W. 'Ask.com's bogus Information Revolution'. The Register. 27-03-2007 (visited 6-3-2009)
http://www.theregister.co.uk/2007/05/29/william_davies_ask_vs_google/page2.html
Davies, W. 'The cold, cold heart of Web 2.0'. The Register. 31-07-2007 (visited 6-3-2009)
http://www.theregister.co.uk/2007/07/31/william_davies_web20/
Doyle, G. Media Ownership, Sage Publications, London: 2002
Dunn, B.N. Website: European Parliament. Question no 78 by Bill Newton Dunn. Subject: ICANN's levy
from price increases imposed on Europeans. Strasbourg: 15-03-2007 (visited 4-4-2009)
http://www.europarl.europa.eu/sides/getDoc.do?type=QT&reference=H-2007-0126&language=EN
Internet Corporation for Assigned Names and Numbers, website, 2008
www.icann.org (visited 4-4-2009)
Leiner, B.M. et al. 'A Brief History of the Internet', version 3.32. Internet Society. Last revised 10 Dec 2003.
http://www.isoc.org/internet/history/brief.shtml (visited 2-4-2009)
McMahon, R. U.S. Internet Providers and the Great Firewall of China. Council on Foreign
Relations. 15-02-2008 (visited 5-4-2009)
http://www.cfr.org/publication/9856/
Nolan, C. ICANN Controversy Is Just the Beginning. Eweek.com, 17-11-2005 (visited 5-4-2009)
http://www.eweek.com/c/a/Government-IT/ICANN-Controversy-Is-Just-the-Beginning/
NTIA (Green Paper) 1998, Registrars, The Process
http://www.ntia.doc.gov/ntiahome/domainname/dnsdrft.htm
O'Reilly, T. 'What Is Web 2.0: Design patterns and business models for the next generation of
software.' Tim.oreilly.com September 2005
http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html
Wal, van der H. 'EC: Amerikaans DNS-beheer kan leiden tot breuk internet' [EC: American DNS
management could lead to an internet split]. Tweakers.net. 14-10-2005 (visited 5-4-2009)
http://tweakers.net/nieuws/39389/ec-amerikaans-dns-beheer-kan-leiden-tot-breuk-internet.html
WSIS. Tunis agenda for the information society. Tunis: 18-11-2005
http://www.itu.int/wsis/docs2/tunis/off/6rev1.html
Utilizing the rules: The rule-based paradigm as a complement to connectionism
Rotating the iceberg 90 degrees, melting it a bit and then letting it freeze again
by Jeroen Knitel
Abstract
This article shows why we do not have to make a clear-cut divide between connectionist models and
rule-based alternatives when researching complex systems. Instead, it stresses the importance of
merging these models to make them more powerful in a complex system; not only in terms of analysis
and description but also in terms of reproduction and representation. After introducing both models
and their properties, the article looks at the complexity of a system like the economy and at the
system of language. By illustrating the essential link with established research, the need for a
hybrid use of the two models becomes clear: each is utilized in its most efficient and advantageous
way.
1 Introduction
1.1 A complex crisis
The contemporary economic crisis is a hot topic. From the macro-economic level of governments and
financial institutions down to the micro level of the average household, you and me, everyone is
involved. But who could have foreseen the economic state we are in right now? Who understands the
economic system to such a degree that all this could have been forecast? Certain people claim that
they knew it was coming well before the rest did, but how did they know? To understand and predict
the behavior of a complex system like the economy we need models (Silvert, 2000): models that
explain, show and to a certain degree describe how a system works and is going to work. These models
and their modeling are the driving force of science (ibidem) and are therefore important for
researching something as complex as the economic system. Paul Cilliers (1998) argues that the
economic system consists of individual agents clustered together to form the "larger-scale
phenomena" (Cilliers, 1998: 7) that constitute the complexity. But how do these agents relate to
each other in the system? What does this network of actors look like and how does it work? To
understand this we have to relate the complex system to a known model. Roughly speaking, there are
two models to distinguish in understanding complex systems: rule-based symbol systems and
connectionist models. Both models have strong support from scientists and their theories, and both
have a proven history (Chomsky 1968, Cilliers 1998). What distinguishes the two, and what are the
main points of critique of each?
1.2 Sets of rules and sets of neurons
In the late 1950s, scientists began arguing about the reproducibility of the human mind with the
help of newly developed computing techniques. Artificial Intelligence (AI) was an emerging field,
and it was the computer that gave a somewhat satisfying answer to the discussion of the separation
of body and mind (Haugeland, 1985). Because the only computers available at the time were in the
form of Turing machines, scientists started to program them to
emulate tasks derived from the human mind. This resulted in computational models: models based on
sets of predefined rules and symbols. The usage of such systems varied; they were used as expert
systems that could play chess (Cilliers, 1998: 14) or even model a language (Chomsky, 1968). But
although programmers could tell such systems what the rules of a language are, actually using a
language turned out to be more complex.
The main characteristics of rule-based systems can be broken down to the following (Serra & Zanarini
1990 in Cilliers, 1998: 14):
+ Rule-based systems model complex systems on an abstract level
+ Rules in the system represent concepts directly
+ The sets of rules are centrally controlled by the meta-rules of the system
+ Each symbol is a local representation of a specific concept
+ The structure of the system is a priori defined
Following on from these characteristics, we can say that rule-based systems are bound by the rules
predefined with them. Furthermore, they are a priori structured by these rules and are therefore
seen as inflexible (Churchland, 1986). Moreover, rule-based systems don't have an original
intentionality or a so-called meaning: as said, they operate on the abstract level of syntax instead
of the semantics interpreted by humans (Searle, 1980). The system can therefore handle symbolic
information but has little to say about how symbolic behavior emerges.
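The characteristics above can be made concrete with a small sketch of a production system. The facts and rules here are hypothetical toy examples, not drawn from any system the article cites; the point is only that every conclusion follows rigidly from a priori defined rules.

```python
# A minimal sketch of a rule-based (production) system. Each rule is an
# a priori defined, local representation of a concept.
facts = {"has_wings", "lays_eggs"}

rules = [
    ({"has_wings", "lays_eggs"}, "is_bird"),  # IF wings AND eggs THEN bird
    ({"is_bird"}, "can_fly"),                 # IF bird THEN can fly (naive rule)
]

# Forward chaining: apply rules until no new symbols can be derived.
changed = True
while changed:
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(sorted(facts))  # -> ['can_fly', 'has_wings', 'is_bird', 'lays_eggs']
```

Note that the system can only derive what its predefined rules allow: the naive "can_fly" rule illustrates exactly the inflexibility Churchland points to, since a penguin would break it.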
The disadvantages of rule-based systems became apparent when describing complex tasks performed by
the human mind, like using a language. Fascinated by the complexity of the human mind, scientists
tried to map a model of the human brain. Sigmund Freud (1950) took a lead in this field, but it was
not until the early 80s that neural models were worked out to the extent that we can call them the
multi-layered connectionist models as we know them today (Cilliers, 1998: 16). But what do we know
about them today?
A connectionist model, or (artificial) neural network, consists of a collection of nodes that
operate like neurons in the human brain. The nodes of this network are interconnected via synaptic
connections, meaning that every node is connected to at least one other node. When a node is active,
it sends out a signal (fires) to the nodes it is connected to. Whether a node in turn activates its
neighbors depends on the weights of its connections and on the number of signals it receives itself.
In relation to the human brain, it is therefore possible to describe any state of the network as the
numerical activation values of the nodes in the network.
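The firing behavior described above can be sketched minimally as a weighted sum with a threshold; the weights and threshold below are illustrative assumptions, not trained values.

```python
# A single node fires when the weighted sum of its incoming signals
# reaches a threshold.

def fires(inputs, weights, threshold=1.0):
    """Return True if the weighted sum of incoming signals reaches the threshold."""
    activation = sum(i * w for i, w in zip(inputs, weights))
    return activation >= threshold

# Two incoming nodes are firing (1), one is silent (0).
print(fires([1, 1, 0], [0.6, 0.5, 0.9]))  # weighted sum 1.1 -> True
print(fires([1, 0, 0], [0.6, 0.5, 0.9]))  # weighted sum 0.6 -> False
```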
The strength of connectionist models lies in the fact that they can be trained by giving them the
state of a problem as input and the desired outcome as output. When you provide the network with
enough input patterns and correct outputs, it evolves "in the direction of a solution" (Cilliers,
1998: 28). The network does so by adjusting the weights of its connections during the training
process so that it generates the desired output. This behavior is referred to as Hebbian, after the
work of Donald Hebb (1949): "the strength of a synapse [connection] increases as it participates in
firing a neuron" (Scott, 1995: 81). By doing so the network can guess the output for a similar but
different input. The
accuracy or correctness of that output depends on the training the network received and on how much
the input differs from the inputs it was trained with. Complex behavior emerges "from the
interaction between many simple processors that respond in a non-linear fashion to local
information" (Cilliers, 1998: 18).
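The Hebbian rule quoted above can be sketched as a simple weight update: a connection strengthens when the nodes on both ends fire together. The learning rate here is an illustrative assumption.

```python
# Hebbian update: "neurons that fire together, wire together".

def hebbian_update(weight, pre_active, post_active, rate=0.1):
    """Strengthen the connection when pre- and post-synaptic nodes fire together."""
    if pre_active and post_active:
        weight += rate
    return weight

w = 0.5
# Both nodes fire together three times: the connection strengthens each time.
for _ in range(3):
    w = hebbian_update(w, pre_active=True, post_active=True)
print(round(w, 2))  # -> 0.8
```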
Compared to rule-based systems, we can say that the main characteristics of connectionist models are
as follows:
+ They operate on the level of neurons and their weights
+ Neurons have no predefined meaning and are softly constrained
+ They degrade gracefully
+ They can recognize patterns and train themselves
+ There is no predetermined centralized control
+ Every node in the network changes dynamically
+ They are self-organizing and therefore dynamically structured
Connectionist models have their weaknesses too; one indisputable point of critique is that rules
derived from connectionist models aren't always readable by humans. When that is the case, they
require "a deep semantic analysis of what is actually learned" (Ledezma et al, 2001).
1.3 Introducing economectionism
When we return to the complex system of the economy with knowledge of both models, how could it be
related to something as low-level as neurons and weights? For that we need to turn to economist
Friedrich Hayek (1982), who argues that we can use the metaphor of a connectionist model to
understand the complex system of the economy:
[T]he mind, from the perspective of The Sensory Order, turns out to be a dynamic, relational
affair that is in many respects analogous to a market process. The mind is a 'continuous
stream of impulses, the significance of each and every contribution of which is determined by
the place in the pattern of channels through which they flow', in such a way that the flow of
representative neural impulses can be compared 'to a stock of capital being nourished by
inputs and giving a continuous stream of outputs'
(Hayek 1982, p. 291 in Smith, 1992: np).
In the next chapter we will look more closely at the connection between the economy and a
sophisticated model like connectionism. More important for this introduction, however, is the
question raised by connecting the economic system to a connectionist model: to what extent can we
understand the economy when we look at it on a level that consists only of neurons and their
weights? We are inevitably missing a comprehensive approach consisting of definitions, symbols and
rules. We are looking only at the tip of the iceberg, and not only in the case of economics.
Therefore, the central question of this article is the following: how do rule-based models remain
useful when researching the operation of complex systems?
To answer this question I will draw on two different fields in which we can see a transformation
from rule- and symbol-based systems towards connectionist models. I will point out that we need to
merge these models, instead of looking at only one of them, to make them more powerful in complex
systems; not only in terms of analysis and description but also in terms of reproduction and
representation.
2 Creating or developing rules
2.1 Marketplace of neurons
When we look at a complex system like the economy, it is not a clear-cut choice between researching
it from a connectionist perspective or from a rule-based one. Rather, I would argue that both kinds
of models are in play when researching it, and that we also need both of them to understand the
complexity behind it. For the connectionist approach to the economic system we can distinguish two
variants. The first consists of understanding the economy itself as a connectionist model. The
second uses such a model to research the economy and subsequently generate a set of rules that can
be used to predict the behavior of the system.
Hayek (1982) argues that the economy can be seen as a neural network. He says that we can only have
a qualitative understanding of the economy and cannot exactly predict it; thus it cannot be made
rational or subject to control. In his article 'The Use of Knowledge in Society' (1945), Hayek
describes the market as working like the mind, where essential information gets passed on in the
form of signals; in the following example, prices act as nerve impulses:
"In abbreviated form [...] only the most essential information is passed on and passed on only to
those concerned. It is more than a metaphor to describe the price system as a kind of machinery for
registering change [...] in order to adjust their activities to changes of which they may never know
more than is reflected in the price movement" (Hayek, 1945: 525).
What we can see when comparing the complex system of the economy with the neural network of the mind
is that both the mind and the economic system evolved (trained) through a massive number of trials
and errors. There are more similarities to point out, such as how all market participants are
interconnected in a way that lets them activate each other, and how each has a certain weight
(richness) in the system. It would be an exaggeration, to say the least, to claim that we can feed
the economy problems and let it process them through all its nodes to come up with a single output;
yet in a way it does function as described above. But is this a process consisting only of rules and
symbols learned by the system, meaning they came forth out of the connectionism, or can we point to
more factors in the economic process?
In their book Neural Networks for Economic and Financial Modelling (1996), Andrea Beltratti et al
state that the economy, as an evolving complex system with agents who continually learn and adapt,
can be forecast when all these agents are implemented in an artificial neural network (ANN).
Crescenzio Gallo et al (2006) state the same by saying that ANNs can model economic agents and show
"behavioral rules derived from simple initial requirements that evolve towards complex simulated
environments" (Gallo et al: 4). Noticeable in their research is that both works talk about
behavioral rules of the economic system that need to be fed to the system as a form of input. In
this way they activate the learning of the network by choosing the "architecture and parameters
necessary for the definition of the connection weights between the neurons in the network" (ibidem).
Both teams of scientists found a number of constraints when building and running an ANN, as they
mention in their work. Ironically enough, it comes down to the two factors that everything else in
the economy is about: time and money. The time needed to develop a certain set of rules out of
massive amounts of input data covering only parts of the economic system is difficult to estimate
and depends on the amount of computational power available (money). Robert Marks (1996), who
reviewed the work of Beltratti et al (1996), notes that this is not a critique of ANNs per se: "What
they [ANN] are not is deductive, and so they cannot provide necessary conditions [...] all
simulation of its nature can only exemplify sufficiency, not necessity" (Marks, 1996: np).
2.2 Fuzzy logic
The missed opportunity here, however paradoxical it may sound, is that the two studies mentioned are
binding themselves to the infinite possibilities of connectionist models. By holding on to the
notion of learning rules, and of sufficiency instead of necessity, rather than predefining rules,
the complexity they want to research becomes a complex and never-ending task in itself. It is for
this reason that computer scientists like Zhou et al (2001) moved towards a model that combines the
best of the connectionist world with parts of the rule-based one, into what they call neural fuzzy
systems (NFS): "With neural fuzzy systems' unique capabilities of dealing with both linguistic
information and numerical data, the linguistic and numerical heterogeneous data can be translated
into an initial structure and parameters" (Zhou et al, 2001: 468). NFS can improve the learning rate
of neural networks tremendously by incorporating fuzzy logic about a system. It is called fuzzy
logic because these rules are a best-known approximation of the possible rules; by utilizing the
learning qualities of the connectionist model, these approximations can be shaped into a developed
set of rules of the complex system. NFS is therefore perceived as a hybrid intelligent system
(Bonissone et al, 1996).
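The idea of a fuzzy rule as a "best-known approximation" can be sketched with a membership function that holds to a degree between 0 and 1 instead of being simply true or false. The rule, names and breakpoints below are invented for illustration and are not taken from Zhou et al.

```python
# A toy fuzzy membership function: to what degree does a price count as "high"?

def membership_high_price(price, low=50.0, high=100.0):
    """Degree (0..1) to which a price counts as 'high'."""
    if price <= low:
        return 0.0
    if price >= high:
        return 1.0
    return (price - low) / (high - low)  # linear ramp between the breakpoints

# The fuzzy rule "IF price is high THEN sell" fires to a degree; in a neural
# fuzzy system, training could then tune the breakpoints like weights.
print(membership_high_price(40))   # -> 0.0
print(membership_high_price(75))   # -> 0.5
print(membership_high_price(120))  # -> 1.0
```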
Returning to Hayek's notion of understanding the economy as a connectionist model, we can also see a
way of incorporating rules: not by applying fuzzy logic to the complex system, but by looking at
defined rules that are fact rather than approximation. Take, for example, the Amsterdam stock
exchange (AEX): the AEX is an index based on 25 Dutch companies that each have their own numerical
value (weight) in the index. Although the stocks operate in a way that can be linked to a neural
network by defining them as neurons, they are also subject to a set of what I call supervening known
rules. For example: a maximum of 25 companies is selected by calculating the turnover on the stock
market of the top 25 stocks traded, every third Friday of February. Such a set of rules literally
sets the boundaries of a given concept within the economic system and builds on the
comprehensibility of the subject.
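The supervening rule described above can be sketched as a fixed selection procedure. The company names and turnover figures are invented for illustration, and the real AEX review procedure is more involved than this.

```python
# A fixed, a priori rule: rank stocks by turnover and keep the top N.
turnover = {  # hypothetical yearly turnover per stock, in billions
    "StockA": 120.0, "StockB": 95.5, "StockC": 310.2,
    "StockD": 12.3,  "StockE": 88.1,
}

MAX_CONSTITUENTS = 3  # the real rule caps the index at 25 companies

constituents = sorted(turnover, key=turnover.get, reverse=True)[:MAX_CONSTITUENTS]
print(constituents)  # -> ['StockC', 'StockA', 'StockB']
```

Unlike a trained weight, this boundary does not emerge from the network; it is imposed on it, which is exactly what makes it a known rule rather than a learned one.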
What I have pointed out in the above examples of the economic system is that we cannot exclude
rule-based systems when researching a complex system like the economy. To get a firm grip on such a
complex system, we need to define certain actors and the operating fields they are bound to,
together with the system's connectionist properties. Having introduced the hybrid possibilities of
the models, I continue by showing how this also holds in the field of language.
2.3 Understanding the system of language
Since the introduction of connectionist models, the system of language has been part of an ongoing
debate between the connectionists and the supporters of the classical rule-based system (Sas, 2002).
On the side of rule-based approaches and computational models we find Jerry Fodor (1975), who argues
that there is a language of the mind that functions as a machine language in which all computations
take place that relate to human cognitive intelligence, and therefore also to the language in which
a human communicates. The defenders of the rule-based paradigm state that there is linguistic or
propositional knowledge in a sort of stored memory that is associated with a set of explicit rules
for cognitive processing. Connectionists, on the other hand, argue that no such linguistic
representation (as in a language of the mind) is needed for the human mind to operate: "Language is
a social art, and linguistic behavior serves a communicative function [...] Language is principally
a device for communication" (Churchland, 1986 in Sas, 2002: 221).
Knowing a language and acquiring a language are part of the same process, say connectionists: "it is
necessary to reject the opposition between competence as a matter of knowing explicit rules and
performance as a matter of the application of such rules" (Smith, 1992: np). There is no such thing
as a stored memory, only an ongoing process of adapting neurons that reflects the current state of
knowledge. By hearing words (input) we develop a set of words, a vocabulary, by ourselves. From
there we construct correct sentences by hearing them over and over again. Daniel Dennett (1991)
argues that not only does our linguistic environment provide enough input; our own outputs can also
be used as new inputs. This autodidactic behavior is what Dennett calls verbal autosimulation, where
we use our own neural network to stimulate itself (ibidem). That said, how should we characterize
the acquisition of grammatical rules, or the vocabulary of a second language? These are explicitly
taught and learned; it is like a direct input of rules instead of learning them through a series of
inputs. Of course one could argue that these rules are fed into the neural network the same way as
any other input, but the explicit and predefined nature of such rules divides our system of language
into an understandable two-part metaphor: one part a vocabulary and rules learned through the use of
the language itself, and one part fixed, added meta-rules for using the language in the best
possible outputting way, for others and for yourself to input again.
2.4 Simulating linguistic communication
In the field of artificial intelligence, simulating language use is perceived as a complex task.
Although rule-based chatbots like ELIZA could sometimes pass Turing tests, they remained bound to
the boundaries set by their rules (Cilliers, 1998). Since connectionist models came into the picture
of language simulation, scientists have tried to weave the complex and advanced system of language
into an ANN. What remains a problem with these kinds of networks is the amount of time and
computational power needed to even come close to the level of rule-based systems like the recent
START project of Boris Katz (Katz et al, 2006; Xu, 2000). By adding prior knowledge to the neural
network, scientists like Frederic Morin and Yoshua Bengio (2005) managed to speed up the learning of
their neural language model by a factor of 200. This addition of sets of rules and predefined
knowledge exemplifies what we have already seen in the research on the economic system, the
important distinction being that these sets of rules are not fuzzy: the data added to the neural
network is fixed knowledge in the form of words and grammar. This hybrid model is therefore what
science is actually all about: it stands on the shoulders of giants. Proven giants in language, that
is.
3 Hybrid models for the win
When we pose the main question of this article again (how do rule-based models remain useful when
researching the operation of complex systems?), we can answer it on different levels. What became
clear in this article is that the models posed at the beginning, the connectionist model and the
rule-based system, both have their advantages and caveats when used to research and understand
complex systems.
What we have seen by looking at the complexity of a system like the economy and the system of
language is that both models remain useful. More important, though, is what a combination of these
two models can mean for research and understanding. With respect to researching complex systems, we
can see a significant leap made possible by hybrid models like neural fuzzy systems. These systems
are a typical and clear example of the power of merging both models into one. Firstly, we have seen
the hybrid model bring considerable gains in speed compared to the separate models when researching
complex systems like the economy and language. Secondly, the fuzzy logic added to the artificial
neural network can become real logic through the learning and correcting properties of the network,
so that we can derive a new set of rules from it.
When approaching connectionist models and rule-based systems as metaphors for complex systems, we
can also see an advantage in using the properties of both. Understanding a complex system such as
the economy in terms of highly interconnected agents on all different levels and layers is a complex
task in itself. By extracting the already existing, supervening rules from the economy, we can add
them to a connectionist understanding of the economy.
Looking at the complex system of language and what we know about it, or better said, what we come to
know about it, we can see both models in action. As toddlers we hear words and the combinations that
form sentences. Going to school and learning certain rules then influences our perception and use of
the language. Although these rules are partly derived from our own trial and error and from
collective understanding, parsing sentences into verbs and indirect objects isn't part of that
neural learning process.
Returning to the overarching iceberg metaphor, we can conclude that we need to treat all sides of
the complexity iceberg as equal: turn it upside down, rotate it clockwise and counter-clockwise, to
see its full potential. Combining the best from both worlds leads, in the many fields where
complexity and the variety of complex systems exist, to more powerful methods, not only in terms of
analysis and description but also in terms of reproduction and representation.
Bibliography
Beltratti, A., Margarita, S., Terna, P. 1996. Neural Networks for Economic and Financial Modelling.
London: International Thomson Computer Press
Bonissone, P., Chen, Y.T., Goebel K., Khedkar, P. 1999. Hybrid soft computing systems: Industrial and
commercial applications. Proceedings of the IEEE 87(9), pp 1641-1667
Chomsky, N. 1968. Language and Mind. New York: Harcourt Brace Jovanovich, Inc.
Churchland, P.S. 1986. Neurophilosophy: Toward a Unified Science of the Mind-brain. Cambridge: MIT
Press
Cilliers, P. 1998. Complexity and postmodernism: Understanding complex systems. London:
Routledge
Gallo, C., De Stasio, G., Di Letizia, C. 2006. "Artificial Neural Networks in Financial Modelling".
Quaderni DSEMS 02-2006
Freud, S. 1950. Project for a Scientific Psychology. London: The Hogarth Press
Fu, L.M. "Connectionism for Fuzzy Learning in Rule-based Expert Systems". Last accessed on April
8th, 2009
Haugeland, J. 1985. Artificial Intelligence: The Very Idea. Cambridge: MIT Press
Hayek, F. A. 1982. "The Sensory Order After 25 Years", in Weimer and Palermo, eds, pp 287-93
Hayek, F. A. 1945. "The Use of Knowledge in Society", American Economic Review. XXXV (4): 519-530
Hebb, D. The Organization of Behavior. New York: Wiley
Katz, B., Borchardt, G., Felshin, S. 2006. Natural Language Annotations for Question Answering.
Proceedings of the 19th International FLAIRS Conference.
Ledezma A., Berlanga, A., Aler, R. 2001. "Automatic Symbolic Modelling of Co-evolutionarily Learned
Robot Skills", International Workconference on Artificial and Natural Networks (1): 799-806
Morin, F., Bengio, Y. 2005. Hierarchical Probabilistic Neural Network Language Model, AISTATS
(np) Last accessed
on April 8th, 2009
Rumelhart, D.E., McClelland, J.L. 1986. Parallel Distributed Processing: Explorations in the
Microstructure of Cognition. Volume 1: Foundations, Cambridge: MIT Press
Sas, P. 2002. "Computers en de natuurlijke taal van het denken" in Filosifie in Cyberspace. Klement:
Kampen
Searle, J.R. 1980. "Minds, Brains, and Programs", Behavioral and Brain Sciences (3): 417-457
Scott, A. (1995). Stairway to the Mind. New York: Springer
Serra R. and Zanarini G. 1990. Complex systems and cognitive processes. Heidelberg: Springer Verlag
Silvert, W. 2000. "Modelling as a discipline", International Journal of General Systems 0(0): 1-22
Smith, B. 1997. "The Connectionist Mind: A Study of Hayekian Psychology" in Hayek: Economist and
Social Philosopher: A Critical Retrospect (pp 9-29). London: Macmillan
Xu, W., Rudnicky, A. 2000. Can Artificial Neural Networks Learn Language Models?
Last accessed on April 8th, 2009
The Uniform Resource Identifier Revis(it)ed
Written by Tom van de Wetering
Abstract
This article claims that the emergence of the World Wide Web has been accompanied by at least four
widespread misinterpretations of its structure. First, scientific dreams about file structures are
misinterpreted as being fulfilled by the very invention of the Web. Second, even after a dozen years
of Web development, many researchers from various fields misinterpreted the Web as a collection of
connected documents. Third, the emergence of social software is misinterpreted as a fundamental
change to the structure of the Web; instead, the underlying technologies of the contemporary Web
represent the metaphor "Web2.0" quite well. Fourth, the "Semantic Web" is misinterpreted as being
able to understand human beings; instead of "Semantic Web", the title "Linked Data" is more
precise. While exploring the differences between linked documents and linked data, the changed
role of the Uniform Resource Identifier receives specific attention. An approach comparable to the
emerging field of Software Studies results in the claim that the URI has lost its monopoly as the
connection between web resources while transforming from locator to activator.
Dream Machines Defined v1.0
"It was Berners-Lee who brought all these dreams into reality" (2001, p. 15). Manuel Castells states it
was because Berners-Lee "implented the software that made it possible to retrieve and contribute
information from and to any computer connected via the Internet: HTTP, HTML and URI (later called
URL)" (ibid.), that the dreams of a long research tradition were fulfilled. Among the members of this
twentieth century tradition are Vannevar Bush, Douglas Engelbart and Ted Nelson, who all imagined
"the possibility of linking up information sources via interactive computing" (ibid.). Using this abstract
definition it seems appropriate to declare their dreams were fulfilled thanks to Berners-Lee, and wecan consider the latter as an very important inventor indeed. The questions which can be debated
are whether dreams of interlinked information sources are indeed that easy defined and whether
those dreams were indeed completely fulfilled when the web was introduced.
If we take a closer look at Ted Nelson's 'dreams', some of which are explained in his text "A File
Structure for the Complex, the Changing, and the Indeterminate" (1965), we notice similarities
with the Web as defined earlier. Nelson's proposition for a file structure that suits complex tasks is
characterized as "a simple building block structure, user-oriented and wholly general-purpose"
(p. 134). Further, "no hierarchical file relations were to be built in; the system would hold any shape
imposed on it" (p. 137). Where HTML suits a "simple building block structure", Berners-Lee's other
invention, the Hypertext Transfer Protocol, suits a non-hierarchical structure. While it is tantalizing to
make such a comparison, looking at the differences between the definitions reveals considerably more.
Where Berners-Lee invented a file structure to organize relations between documents, Nelson's main
concern was to invent a file structure to organize relations within documents. Where the first Web
browsers, using Berners-Lee's HTML, HTTP and URI, realized the dream of easily accessing thousands
of documents stored at different locations, the "dream file" providing "the capacity for intricate and
idiosyncratic arrangements, total modifiability, undecided alternatives, and thorough internal
documentation" (p. 134) still needed to be invented.
In short, conflicting definitions of 'file structures' become problematic when documenting which
long-lasting dreams have been fulfilled. Even more importantly, the same is true for researching file
structures. Ever since its invention, the Web has become ubiquitous not only in society, but also in
science, as an object of study. Castells' misinterpretation of scientific dreams amounts to a tiny
research flaw, but a wrong definition of the Web, when the Web is the prime object of study, could
make a whole project less useful. As the problem of weak definitions applies specifically to changing
objects, and bearing in mind that the Web is an object that changes frequently, it is useful to
consider whether the 'file structure' of the Web has changed since its invention and how Web
studies deal with such changes. Here this is done in reverse order: I will introduce influential Web
studies first, before criticizing them by stating that the structure of the Web, and thus their object
of study, has changed over the years.
Web Research Defined v1.1
Web studies can be divided loosely into two categories. The first treats the structure of the Web as
a given assumption while researching another phenomenon. Among the many studies of the
philosophy, the geography and the culture of the Web, the field of hyperlink network analysis (HNA)
implements basic definitions of the structure of the Web as assumptions in research projects, for
example: "the basic structural element of the Internet is the hyperlink" (Park, 2003: 49). Park shows
how HNA can be seen as a subdivision of social network analysis, in which case a "website is regarded
as an actor and the hyperlink among sites represents a relational connection or link" (p. 50). While
Park is an enthusiast of the computer-assisted measurement method, which allows automated data
gathering, his awareness of the methodological consequences of HNA appears in the form of a dozen
critical questions, for example: "are meaningful communication relations being maintained or
transmitted via hyperlinks [and] what does the location of websites in the hyperlink networks
mean?" (p. 58). Both questions show there is a need to study the structure of the Web.
The Web-dedicated part of network science is an example of a field more directly concerned with
the structure of the Web. Network scientists are especially interested in the behaviour of networked
structures themselves, before trying to say something about what specific structures represent. For
example, the behaviour of the Web as a network is investigated before, or even without, a statement
being made on what this behaviour means for objects related to the Web. While trying to map the
underlying architecture of the Web, Albert-László Barabási discovered some interesting properties.
The structure of the Web, and of a variety of other complex systems, is "scale-free", that is, "some
nodes have a tremendous number of connections to other nodes, whereas most nodes have just a
handful" (2003: 52). Looking at Barabási's methodology reveals an approach similar to that of HNA
scholars. Nodes are defined as Web pages and the connections between those nodes are
represented by hyperlinks. To what extent this is a correct observation of the structure of the Web
is in principle not very important to the field of network science. The network and what it represents
do not have to match completely for useful network properties and effects to be discovered. It
becomes more dangerous when network scientists try to declare something about the object they
represented in the form of a network without defining that object correctly. For example, when
Barabási declares that he discovered that a few Web pages contain lots of hyperlinks, which make it
possible to reach a lot of other pages via few links, he is right; but stating that "a few highly
connected pages are essentially holding the World Wide Web together" (2003: 52) requires a correct
definition of what the Web is, how it is held together, what pages are and how they are connected.
This study can be seen as an attempt to further investigate Park's questions and to develop a better
understanding of how the Web is structured. I will not propose a new set of justified definitions,
following: "whether we are browsing a web site, use Gmail, play a video game, or use a GPS-enabled
mobile phone to locate particular places or friends nearby, we are engaging not with pre-defined
static documents but with the dynamic outputs of a real-time computation" (2008: 19). In other
words: contemporary web sites are not a collection of connected static HTML documents, but a
much more complex system of real time computations and HTML documents are not directly
transported from a server to a client (a browser) which tries to display them. This facts have
inevitable consequences for our understanding of how the Web works. Where much attention is paid
to the hardware which facilitates the Web, the study of the involved software needs more
investigation.
Without trying to grasp the full arsenal of contemporary Web software, three examples are critical
for the structure of the Web and therefore need strong attention. What happens when a browser
tries to access a typical web page (for example: "http://www.networktheory.nl/view/test.html") can
loosely be described as follows. The domain name "networktheory.nl" is reached just as it was in
1990: on this level we do not expect structural changes. Then, in the original situation of 1990, the
server would be asked to deliver a document named "test.html" stored in the subfolder "view" on
the server; but, typically for recent web sites, the subfolder and the document do not physically
exist. Instead, the server redirects the browser to the root folder, where the file "index.php" is
opened. This single file, a script, replaces all the documents previously stored physically in server
folders. The script, which acts in collaboration with a lot of other scripts, connects to a MySQL
database. This database contains all the data needed for the construction of the requested web
page, stored in defined pieces. The script retrieves the needed pieces and sends them to the browser
in the form of an HTML document containing pieces of eXtensible Markup Language (XML). Using the
browser's JavaScript engine, the user is able to view, update, move, sort, edit, remove and mix up
the pieces of XML content, and information on such user actions is stored in the MySQL database.
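The retrieval process described above can be sketched in a few lines. This is a minimal, hypothetical stand-in for the index.php front controller: the page names, the dictionary standing in for the MySQL database, and the function name are all illustrative, not taken from any actual site.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the MySQL database: content is stored in
# defined pieces, not as whole documents in server folders.
DATABASE = {
    "/view/test.html": {"title": "Test", "body": "Hello from the database."},
}

def index(uri):
    """A minimal front controller: every request is routed to this single
    script, which assembles an HTML document from database pieces."""
    path = urlparse(uri).path
    record = DATABASE.get(path)
    if record is None:
        # Even a misspelled URI still yields a document.
        return "<html><body><h1>Not found</h1></body></html>"
    return ("<html><head><title>{title}</title></head>"
            "<body>{body}</body></html>").format(**record)
```

Calling `index("http://www.networktheory.nl/view/test.html")` assembles and returns a complete document, although no file "test.html" in a folder "view" ever existed.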
Semantic Web Defined v2.2
Scripts, databases and XML (which is a database in the form of a document) are obviously used by
the majority of websites to date and facilitate the emergence of "Web2.0". According to Van den
Boomen (2007), Web2.0 is a metaphor which refers to websites where the user is "in control".
Popular Web2.0 websites like Wikipedia, Facebook and Blogger depend on the combination of
scripts, databases and XML, and use technologies on top of these foundations to become "user
controlled". RSS feeds are built on top of XML to allow content to flow around to other web sites.
APIs are built on top of databases to allow other web sites to retrieve and store data. Complex user
scripts are written on top of general scripts to allow users to add profiles and other data to the
website. It is important to note that Web2.0 is built on top of a common set of underlying
technologies. However, it is even more important to realize that the Web did not require
"user control" to become next-generation, or "2.0". The fluid but ubiquitous transition from a Web
consisting of static HTML documents stored physically on servers to a Web consisting of XHTML
documents dynamically created through interactions between various scripts and databases brought
the Web to a fundamentally different structure.
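That RSS is "built on top of XML" can be made concrete with a short sketch: an RSS feed is plain XML, so any other site can parse it and reuse the content. The feed below is invented for illustration.

```python
import xml.etree.ElementTree as ET

# A minimal, invented RSS 2.0 feed: syndicated content is just XML,
# which is why it can flow around to other web sites.
FEED = """<rss version="2.0"><channel>
  <title>Example blog</title>
  <item><title>First post</title><link>http://example.com/1</link></item>
  <item><title>Second post</title><link>http://example.com/2</link></item>
</channel></rss>"""

def item_titles(feed_xml):
    """Extract the item titles from an RSS feed string."""
    root = ET.fromstring(feed_xml)
    return [item.findtext("title") for item in root.iter("item")]
```

Any script on any site can call `item_titles(FEED)` and republish the result, without access to the originating server's database.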
If we do associate the Web2.0 metaphor with the actual development of the Web technique, the
underlying software, instead of with a particular group of (social) practices built upon that technique,
then various efforts to conceptualize the Semantic Web deserve credibility as first attempts at a Web
"version 2.0". As mentioned earlier, authors like Bush (1945) and Nelson (1965) were already
dreaming of file systems which were not actualized by the very invention of the Web, but which are
closely related to dreams of a Semantic Web. While "revisiting" a previous conceptualization of the
Semantic Web (2001), Berners-Lee et al. describe the concept as "a Web of actionable information --
information derived from data through a semantic theory for interpreting the symbols" (2006: 96). In
words more familiar to Web development: a "semantic theory for interpreting the symbols" would
be the process of standardization. Where "the use of HTTP within the CERN particle physics
community led to the revolutionary success of the original Web" (linked documents), standardization
is again the key instrument to allow the emergence of linked data.
URL to URI 3.0
Berners-Lee et al. describe a set of Web standards that contribute to a web of linked data. They
describe the Web Ontology Language (OWL), which offers "greater expression in [...] object and
relation descriptions" between different data sets. The Resource Description Framework (RDF) is a
standard to specify relations between pieces of XML documents, another semantic Web standard,
through assigning "specific Uniform Resource Identifiers (URIs) to its individual fields" (p. 97). To
understand how RDF works, it can be compared to bibliographical references, pointing to the
resources a specific field (a citation) is related to. However, it may be difficult to trace the resource
of a bibliographical reference, because it may be out of stock, it may have changed over time or it
may be incorrectly identified. The URI, often inadequately described as "hyperlink" or "Universal
Resource Locator" (URL), may raise identical problems as the bibliographical reference, which
reduces the semantic results the RDF standard could deliver to the Web. For Berners-Lee et al., and
for researchers like Castells and Park who still limit the URI's definition to a "hyperlink" between
documents, it is important to be aware of the changing role of the URI.
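RDF's underlying model can be sketched as subject-predicate-object triples in which URIs identify the individual fields. The toy graph below uses plain Python tuples rather than any RDF library; the subject and object URIs are hypothetical, and the predicates are borrowed from the Dublin Core vocabulary for illustration.

```python
# A toy RDF-style graph: each statement is a (subject, predicate, object)
# triple, and URIs identify the fields. All example.com-style URIs here
# are invented for illustration.
triples = [
    ("http://networktheory.nl/article/1",
     "http://purl.org/dc/terms/creator",
     "http://networktheory.nl/author/vandewetering"),
    ("http://networktheory.nl/article/1",
     "http://purl.org/dc/terms/title",
     "The URI Revis(it)ed"),
]

def objects(subject, predicate, graph):
    """Return all objects related to a subject by a given predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]
```

Like a bibliographical reference, each triple merely identifies its resources; whether the URI in the object position can actually be dereferenced is a separate question.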
The Internet Society, which administers Internet protocols, observes that ambiguity too, as
described in a document written by Berners-Lee et al. to define the general syntax of the URI:
"A common misunderstanding of URIs is that they are only used to refer to accessible resources. The
URI itself only provides identification; access to the resource is neither guaranteed nor implied by the
presence of a URI. Instead, any operation associated with a URI reference is defined by the protocol
element, data format attribute, or natural language text in which it appears" (2005)
The current URI standard offers opportunities that reach further than accessing documents, as can
be interpreted from the following scheme:

         foo://example.com:8042/over/there?name=ferret#nose
         \_/   \______________/\_________/ \_________/ \__/
          |           |            |            |        |
       scheme     authority       path        query   fragment
Contemporary websites make extensive use of the query functionality of the URI standard, as
described earlier in the example of a typical process of Web page retrieval. The only document
directly accessed using URIs is the index.php file. The "path" is subsequently virtualised by the
underlying scripts of the website system. The resource identified is stored in a database, but, as
mentioned by Berners-Lee et al.: "access to the resource is neither guaranteed nor implied" (ibid.).
This very statement has consequences for fields like HNA, which cannot declare that there is a
relation between a specific URI and a specific resource without testing whether the connection
between the identifier and the resource really exists. But more importantly, the change from
server-stored documents to database-stored data has redefined the ontology of the resource. In a
practical sense it is sometimes arguable to relate a specific URI, like
http://www.networktheory.nl/user/john/, to the document the browser displays when asked to
request that URI. The facts are different: the only resource requested is the index.php script, which
controls, in cooperation with other scripts, which output is assembled using data stored in the
database. For example, the index.php script can be designed to always deliver exactly the same
output, no matter what identifier follows the "authority". In the old situation such a trick would
require multiple copies of documents, which are from an ontological perspective identical, but are
nevertheless different documents and resources.
Practically, the situation of linked data simulates the situation of linked documents, especially if we
look at the output displayed by a browser, which is still an HTML document most of the time. Yet
the current situation offers advantages compared to the situation of linked documents. For
example, the index.php script is still able to output a document, even when a misspelled URI is
accessed. The most profound change is therefore found in another practice built upon Web
standards: practices realizing Berners-Lee's wildest dreams of linked data.
In short, a URI accessing an index.php script, which relates the data contained in that URI to data
contained in the database, is already a form of linked data: two sources of data are linked and
together form an output. However, the power of the concept of linked data is that practically every
data source can be linked to others at the same time. These data sources can be other Web
databases accessed via Application Programming Interfaces (APIs), data transmitted via interfaces
like human-computer interfaces (HCI), GPS interfaces and many other sources. In fact, as Cramer
and Fuller put it: "the distinction between a 'user interface', an 'Application Program Interface'
(API), and a computer control language is purely arbitrary" (2008: 150). In other words: every
available data source can be useful for script-actors, and their programmers, to construct an
assemblage to output to the browser.
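Such an assemblage can be sketched as a script-actor that links three data sources into one output: the identifier in the URI, a local database, and a stand-in for an external API. Everything here is hypothetical: the user record, the coordinates, and the fake_geo_api function merely mimic the kind of API a real site might call.

```python
from urllib.parse import urlsplit

# Hypothetical local database, keyed by user name.
DATABASE = {"john": {"name": "John", "city": "Utrecht"}}

def fake_geo_api(city):
    """Stand-in for an external geo API; the coordinates are invented."""
    return {"Utrecht": "52.09N, 5.12E"}.get(city, "unknown")

def assemble(uri):
    """Link URI data, database data and API data into one output."""
    user = urlsplit(uri).path.rstrip("/").split("/")[-1]
    record = DATABASE.get(user, {})
    coords = fake_geo_api(record.get("city", ""))
    return "<p>{name} is near {coords}</p>".format(
        name=record.get("name", "unknown"), coords=coords)
```

Requesting http://www.networktheory.nl/user/john/ through this sketch yields an output assembled from all three sources at once; no single stored document corresponds to it.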
To conclude: the URI is far from the only "resource identifier" on the Web. Instead, many other sets
of data relate to resource assemblages, stored in various databases. Such identifiers play an
increasingly important role in the process of, formerly called, resourc