

The Fat-Stack in Network-on-chips (NOCs):

New Structure for interconnection networks

Reza Kourdy

Department of Computer Engineering, Islamic Azad University,

Khorramabad Branch, Iran

Mohammad Reza Nouri Rad

Department of Computer Engineering, Islamic Azad University,

Khorramabad Branch, Iran

Abstract: This paper shows that a novel network called the fat-stack is efficient and is suitable for use as a baseline distributed network and as a crucial benchmark architecture for evaluating the performance of specific distributed networks. The fat-stack structure makes the network scalable to closely represent a distributed network. We show that the fat-stack is efficient by proving it is universal. A requirement for the fat-stack to be universal is that link capacities double from each level to the next going up the network.

Index Terms: Networks-on-Chip (NoC), fat-stack, general fat-stack (GFS), augmented fat-stack (AFS).


1 INTRODUCTION

Networks-on-Chip (NoC) have been proposed as a promising solution to multi-processor on-chip communication problems. To catalyze the deployment of the NoC paradigm in high-performance computational applications, many challenging research problems in NoC design abstractions need to be addressed at all levels. Active problems in NoC design include design space exploration of NoC architectures for applications; application scheduling and mapping algorithms; evaluation of switching, topology, and routing algorithms for efficient execution of applications; and optimization of communication cost, area, and power [1]. The topology of a network determines its efficiency to first order. A network architecture can be considered as consisting of a distinct topology, varied link capacities, and a specific routing scheme. An architecture resembling practical networks provides not only a working model but also a foundation for studying different networks and devising novel network services.

In this paper we prove analytically that a certain architecture is the most suitable for distributed networking and can be used as a benchmark to evaluate the performance of specific network topologies. The proofs are based on routing results and hardware layouts developed in the context of interconnection networks for parallel computers. It is both theoretically significant and desirable to ensure that the premises and results of the proofs remain valid across scales. We show how to scale a VLSI network up to represent a distributed network such that routing properties are retained.

An efficient network should move traffic quickly for the computing task and require no excessive hardware to build. We show that the fat-stack is such an efficient network by showing that it is universal, i.e., it can simulate any other network with an overhead of no more than (some power of) the logarithm of the area A of the hardware containing the network. This universality result implies that the fat-stack performs as well as or much better than most, if not all, known networks. The term "fat-stack" stems from the observation that the network is a construct of identical atomic sub-network units stacked up and tapering rapidly upwards.

The fat-tree was the first network proved universal [2]. However, it is universal only under the unit wire delay condition; its universality does not hold under non-unit wire delays [3]. The fat-pyramid has been proven universal under both unit and non-unit wire delay conditions [3]. The fat-tree has been used in the CM-5 parallel computer, whereas the fat-pyramid has not been adopted for any machine. Another clear advantage of the fat-pyramid over the fat-tree is its better absolute efficiency due to its hierarchical meshes. But these same meshes increase the fat-pyramid's wire usage considerably and make it not scalable to represent a distributed network.

The fat-stack is relatively simple in structure, which makes it scalable to closely represent a distributed network. It can be constructed by stacking up atomic sub-network units following a fat-tree framework. A sub-network unit is made of a ring of nodes and one or more upward links, each from one node of the unit. These links connect to the same node of the sub-network directly above the unit. The network is built up recursively. We consider two variants of the fat-stack in this paper. One has only one upward link from a sub-network, and the top-level node is omitted. This variant is not strictly based on a tree due to the omission of some links. We refer to this variant as the general fat-stack (GFS), which is the main focus of this paper. The second variant has as many upward links as the number of nodes in the sub-network. We refer to this variant as the augmented fat-stack (AFS).

JOURNAL OF COMPUTING, VOLUME 4, ISSUE 5, MAY 2012, ISSN 2151-9617

https://sites.google.com/site/journalofcomputing

WWW.JOURNALOFCOMPUTING.ORG 18

2  NOC FRAMEWORK 

The NoC framework/system consists of five main modules:

i) The processing architecture
ii) The communication infrastructure
iii) The communication paradigm
iv) The monitor module
v) The traffic generator module

The NoC processing architecture consists of several master/slave processing elements (PEs) that are connected to the communication infrastructure by means of a network adapter. A PE is either a master or a slave, depending on whether it can initiate a message transfer or only respond to a request. Only master PEs can initiate a message transfer. Slave PEs respond to requests from a master PE either by sending back the requested signals/data or by saving the received information. The UART, TIMER, and Instruction/Data Memory are all considered slave PEs, whereas the master PEs used in the design are capable of performing arithmetic and logical operations.

The network adapter receives signals from the PEs and generates packets to be sent to the communication infrastructure. Hence, the main function of the adapter module is to transform data to and from the format required by the underlying infrastructure. The data/message is communicated as packets. The entire message can be sent as a single packet, or a packet can be divided into flits before it is transmitted.
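The division of a packet into flits can be illustrated with a small Tcl sketch. It is only an illustration under stated assumptions: the proc name `split_into_flits` and the flit size are hypothetical, the packet is modeled as a plain string, and the paper does not specify how its network adapter performs this step.

```tcl
# Split a packet payload (modeled here as a string) into fixed-size flits.
# ASSUMPTION: flit_size is a hypothetical parameter; the paper gives no value.
proc split_into_flits {packet flit_size} {
    if {$flit_size <= 0} {
        error "flit_size must be positive"
    }
    set flits {}
    set len [string length $packet]
    for {set pos 0} {$pos < $len} {incr pos $flit_size} {
        # string range clamps the end index, so the last flit may be shorter.
        lappend flits [string range $packet $pos [expr {$pos + $flit_size - 1}]]
    }
    return $flits
}

# Example: a 10-character payload with 4-character flits gives 3 flits.
set flits [split_into_flits "ABCDEFGHIJ" 4]
puts "flit count: [llength $flits]"
puts "flits: $flits"
```

A real adapter would typically also mark head and tail flits and carry routing information in the head flit; that is left out here because the paper does not describe its packet format.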

3 SYSTEM ARCHITECTURE 

Network topology determines the connectivity among nodes and is therefore a first-order determinant of network performance and energy efficiency. Since the ability of the network to disseminate information efficiently depends largely on the topology, we focus especially on different types of topologies.

4. SIMULATION METHODOLOGY

In this section, AFS on-chip interconnects are simulated using the simulator developed in [4]. This discrete event-driven simulator is based on ns2 [5], which provides many facilities for describing network topology, transmission protocols, routing algorithms, and traffic generation. The main objective of using ns2 is to rapidly explore and evaluate the performance metrics as well as the energy consumption of on-chip interconnects.

4.1. Simulation Details

In this paper, we have modeled our architecture concepts with the widely used network simulator ns-2 [6]. ns-2 has been widely applied in research related to the design and evaluation of computer networks and to evaluate various design options for architectures [7], including the design of routers, communication protocols, etc.

ns-2 [8] is a discrete event network simulator designed for the simulation of ordinary computer networks. As many models of network components are provided, the user can simulate at a high abstraction level. Yet it is also possible to implement new components in the network model. ns-2 has support for local area networks, mobile networks, and even satellite networks. Two computer languages are used in ns-2, namely C++ and OTcl.

We use ns-2 [9], [10], which has been used extensively in research on the design and evaluation of public-domain computer networks, to evaluate various design options for the NoC architecture, including the design of the router, the communication protocol, and the routing algorithms.

ns-2 is an open-source, object-oriented, discrete event-driven network simulator written in C++ and OTcl. It is a very common and widely used tool for simulating small and large area networks [11].

All of the topology parameters can be described in a script file written in Tcl. The part of the ns-2 script file that constructs the topology is shown below:

#puts "--------Node Index-------------------------"
# Create the switch nodes and color them by level.
for {set i 0} {$i < $sum} {incr i} {
    set sw($i) [$ns node]
    $sw($i) label sw$i
    if {$i == 0} { $sw($i) color blue }
    if {$i >= 1 && $i <= 3} { $sw($i) color red }
    if {$i >= 4 && $i <= 12} { $sw($i) color #00aa00 }
    if {$i >= 13 && $i <= 39} { $sw($i) color brown }
    if {$i >= 40 && $i <= 120} { $sw($i) color #696969 }
    if {$i >= 121 && $i <= 363} { $sw($i) color #ff8c00 }
    if {$i >= 364 && $i <= 1092} { $sw($i) color #000080 }
}

#puts "--------Resource Index-------------------------"
# Create one resource node per switch, drawn as a square.
for {set i 0} {$i < $sum} {incr i} {
    set Res($i) [$ns node]
    $Res($i) label Res$i
    $Res($i) shape square
}

# Create links (switch-to-switch circular links): each group of three
# consecutive switches 3i+1, 3i+2, 3i+3 is connected as a ring.
for {set i 0} {$i < $sum / 3} {incr i} {
    $ns duplex-link $sw([expr {$i*3 + 1}]) $sw([expr {$i*3 + 2}]) 1Mb 10ms DropTail
    $ns duplex-link $sw([expr {$i*3 + 2}]) $sw([expr {$i*3 + 3}]) 1Mb 10ms DropTail
    $ns duplex-link $sw([expr {$i*3 + 3}]) $sw([expr {$i*3 + 1}]) 1Mb 10ms DropTail
}
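The index ranges used for coloring in the script (0, 1 to 3, 4 to 12, 13 to 39, and so on) suggest that each level holds three times as many switches as the level above it, so a network with L levels would have 1 + 3 + ... + 3^(L-1) = (3^L - 1)/2 switches. The following plain-Tcl sketch, which runs without ns-2, computes that count and the level of a given switch index. The proc names `afs_node_count` and `afs_level_of` are hypothetical helpers, and the ring size of 3 is inferred from the script rather than stated as a general rule in the paper.

```tcl
# Number of switches in a fat-stack with the given number of levels,
# ASSUMING one top switch and `ring` times more switches on each lower
# level (ring = 3 in the script): 1 + ring + ... + ring^(levels-1).
proc afs_node_count {levels {ring 3}} {
    set total 0
    set width 1
    for {set l 0} {$l < $levels} {incr l} {
        incr total $width
        set width [expr {$width * $ring}]
    }
    return $total
}

# Level (0 = top) of a switch index under the same numbering.
proc afs_level_of {index {ring 3}} {
    set level 0
    set first 0   ;# index of the first switch on the current level
    set width 1   ;# number of switches on the current level
    while {$index >= $first + $width} {
        incr first $width
        set width [expr {$width * $ring}]
        incr level
    }
    return $level
}

foreach levels {3 4 5} {
    puts "$levels levels: [afs_node_count $levels] switches"
}
```

Under these assumptions the value of `sum` in the script could be set as `set sum [afs_node_count $levels]`, and the chain of `if` tests that picks a color could be replaced by a lookup on `afs_level_of $i`.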


5. SIMULATION RESULTS 

In this section, we present simulations of NoCs with different numbers of levels using the NOC-Fat-Stack topology, and we survey the ability and flexibility of ns2 in NOC-WK-recursive (network-on-chip-WK-recursive) simulations. Figures 1 to 9 show different views of NOC-AFS simulations.

5.1. The 3-levels NOC-AFS

Simulations with a high number of nodes may be displayed in different views. For example, Figures 1 and 2 show different views of the 3-levels NOC-AFS topology, each of which consists of 13 nodes.
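The 13-node case is small enough to enumerate link by link. The sketch below lists the links of a 3-levels AFS in plain Tcl, using the switch numbering of the ns-2 script (ring r holds switches 3r+1, 3r+2, and 3r+3). The rule that every switch of ring r has an upward link to switch r is an assumption made for illustration: the script excerpt shows only the ring links, and the proc name `afs_links` is hypothetical.

```tcl
# Enumerate the links of an AFS whose switches are numbered as in the
# ns-2 script: ring r holds switches 3r+1, 3r+2, 3r+3.
# ASSUMPTION: every switch of ring r has an upward link to switch r.
proc afs_links {node_count} {
    set links {}
    for {set r 0} {$r < $node_count / 3} {incr r} {
        set a [expr {3*$r + 1}]
        set b [expr {3*$r + 2}]
        set c [expr {3*$r + 3}]
        # Ring links inside the sub-network unit.
        lappend links [list $a $b] [list $b $c] [list $c $a]
        # Upward links to the shared switch one level above (assumed rule).
        lappend links [list $a $r] [list $b $r] [list $c $r]
    }
    return $links
}

set links [afs_links 13]
puts "links in a 3-levels AFS: [llength $links]"
foreach link $links {
    puts "  sw[lindex $link 0] <-> sw[lindex $link 1]"
}
```

With 13 switches there are four rings, so this rule yields 12 ring links and 12 upward links. A GFS under the same numbering would keep only one upward link per ring; it is not sketched here because the paper does not say which ring node carries that link.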

5.2. The 4-levels NOC-AFS

Simulations with a high number of nodes may be displayed in different views. For example, Figures 3 to 5 show different views of the 4-levels NOC-AFS topology, each of which consists of 40 nodes.

Fig. 1. The 1st view of the 3-levels NOC-AFS

Fig. 2. The 2nd view of the 3-levels NOC-AFS

Fig. 3. The 1st view of the 4-levels NOC-AFS

Fig. 4. The 2nd view of the 4-levels NOC-AFS


5.3. The 5-levels NOC-AFS

Simulations with a high number of nodes may be displayed in different views. For example, Figures 6 to 9 show different views of the 5-levels NOC-AFS topology, each of which consists of 121 nodes.

Fig. 5. The 3rd view of the 4-levels NOC-AFS

Fig. 6. The 1st view of the 5-levels NOC-AFS

Fig. 7. The 2nd view of the 5-levels NOC-AFS

Fig. 8. The 3rd view of the 5-levels NOC-AFS


REFERENCES 

[1] J. Suseela and V. Muthukumar, "Performance Analysis of WK-Recursive and Torus Routing," in Proceedings of the International Conference on Embedded Systems and Applications, July 2011.

[2] C. E. Leiserson, "Fat-trees: universal networks for hardware-efficient supercomputing," IEEE Transactions on Computers, C-34(10):892–901, Oct. 1985.

[3] R. I. Greenberg, "The fat-pyramid and universal parallel computation independent of wire delay," IEEE Transactions on Computers, 43(12):1358–1364, Dec. 1994.

[4] Y. R. Sun, S. Kumar, and A. Jantsch, "Simulation and Evaluation of a Network On Chip Architecture Using ns2," Proc. The IEEE NorChip Conference, 2002.

[5] NS, Network Simulator, NS2, http://www.isi.edu/nsnam/ns, accessed June 2008.

[6] www.isi.edu/nsnam/ns

[7] R. Lemaire, F. Clermidy, Y. Durand, D. Lattard, and A. Jerraya, "Performance Evaluation of a NoC-Based Design for MC-CDMA Telecommunications Using NS-2," in The 16th IEEE International Workshop on Rapid System Prototyping, Jun. 2005, pp. 24–30.

[8] L. Breslau, D. Estrin, K. Fall, S. Floyd, J. Heidemann, A. Helmy, P. Huang, S. McCanne, K. Varadhan, Ya Xu, and Haobo Yu, "Advances in network simulation," IEEE Computer, 33(5):59–67, May 2000.

[9] LBNL Network Simulator, http://www-nrg.ee.lbl.gov/ns/

[10] The network simulator ns-2, available at http://www.isi.edu/nsnam/ns/

[11] M. Ali, M. Welzl, A. Adnan, and F. Nadeem, "Using the NS-2 Network Simulator for Evaluating Network on Chips (NoC)," International Conference on Emerging Technologies, pp. 506–512, 2006.

Fig. 9. The 4th view of the 5-levels NOC-AFS

Reza Kourdy received his B.Sc. degree in Computer Engineering and his M.Sc. degree in Computer Architecture, both from Azad University of Arak, Iran, in 2002 and 2007, respectively. His research interests include Network-On-Chip Architecture and Fault-tolerance.

Mohammad Reza Nouri Rad received his B.Sc. degree in Computer Engineering (Software) from Azad University of Najafabad, Iran, in 2001, and his M.Sc. degree in Computer Software from Azad University of Arak, Iran, in 2010. His research interests include Network-On-Chip Architecture and Network Security. He is a Program Committee member of the following conferences:
• WICT 2011
• CSNT 2011
• CICN 2011
• SocProS 2011
• CSNT 2012
• CICN 2012
• BIC-TA 2012
