

O2CDMA: Enable An Energy-Efficient OCDMA On The Chip

Qiaosha Zou, Tao Zhang, Jishen Zhao, Jin Ouyang, Yuan Xie
Department of Computer Science and Engineering
Pennsylvania State University, University Park, PA 16802
{qszou, zhangtao, juz138, jouyang, yuanxie}@cse.psu.edu

Abstract: Optical On-chip Networks (O-OCN) have been proposed as a revolutionary technique that provides unprecedented performance with reasonable power consumption. Specifically, microring-based optical interconnect is considered a promising solution because of its compatibility with silicon technology and its affordable area overhead. The state-of-the-art O-OCN, however, has hundreds of thousands of microrings and tens of waveguides, which dissipate much “redundant” power and are vulnerable to the ambient temperature. In this paper, we rethink the O-OCN implementation and apply optical code division multiple access (OCDMA) technology to O-OCN, aiming to reduce the number of microrings and waveguides and thereby save power. Our preliminary experimental results show that On-chip OCDMA (O2CDMA) outperforms Corona with a 27.4% performance improvement.

I. INTRODUCTION

As system performance steps from petascale to exascale with thousands of cores, Moore’s Law can no longer meet the system requirements. To feed such a huge number of cores, the traditional electronic on-chip network (E-OCN) should be replaced with a novel technology because of its poor scalability and large power consumption. In contrast to E-OCN, optical on-chip networks (O-OCN) have been proposed as excellent candidates for the next-generation on-chip network [1] [2] [3]. O-OCN features high speed (the speed of light), low transmission power cost, and less long-distance attenuation. Specifically, microring-based O-OCN has become the mainstream because of the ring’s simplicity and scalability. Correspondingly, both photonic crossbars (e.g., Corona and FireFly) and photonic routers (e.g., Phastlane and PhoenixSim) have been studied to exploit the physical advantages of photonics and thus improve system performance and power.

Even though prior work shows the potential of O-OCN for performance and power saving, the performance and power efficiency of O-OCN are still relatively low. The low power efficiency stems from the hundreds of thousands of microrings used as modulators, filters, or photodetectors. According to the power breakdown in [4] (shown in Fig. 1), the modulator, photodetector (receiver), and tuning are the three main contributors to O-OCN power consumption. Unfortunately, only a small fraction of the photodetectors are in use, while the others still dissipate power even though they do nothing. A common misconception is that a higher modulation rate (e.g., 12.5 Gb/s) always produces better performance. Instead, [5] argues that the power penalty caused by modulators can be an impediment to further performance improvement under a given power budget. The trade-off between performance and power consumption should therefore be considered carefully. Moreover, the microring is vulnerable to ambient temperature changes as well as process variation. Additional power must be spent on so-called trimming, which is a thermal tuning that keeps the microring resonant at the desired wavelength. Fine-grained trimming is non-trivial because the number of microrings is large. Also, the trimming power is expected to depend on the ring count in a small die, although some studies have found that the power is independent of the number of microrings when the die is relatively large [6][7]. In general, the large number of microrings not only affects system performance but also lowers power efficiency. Therefore, we need to rethink the O-OCN architecture to reduce the number of microrings so that it eliminates “redundant” power consumption while still sustaining performance.

In this paper, we borrow the idea of optical code division multiple access (OCDMA) from local area networks (LAN) and introduce a novel O-OCN, O2CDMA, to reduce the ring count and thereby reduce the power consumption. The rest of the paper is organized as follows. Section II gives a brief introduction to OCDMA basics as the relevant background. Section III presents our O2CDMA architecture in detail. Section IV shows the preliminary evaluation results, followed by Section V, which concludes this paper.

Fig. 1. Optical Point-to-Point Power Consumption. [Chart of power in μW; legend: Modulator, Receiver, Tuning, Laser; values as extracted: 474, 440, 500, 181.]

Fig. 2. OCDMA Network. [Diagram: data transmitters (Node i, Node j, Node k, ...), each with an OCDMA encoder and an E/O converter, share a fiber (waveguide) with data receivers (Node p, Node q, Node r, ...), each with an O/E converter and an OCDMA decoder.]

II. OCDMA BASICS

OCDMA has attracted wide attention in LAN. It is envisioned as the next-generation fiber technology that can easily provide high bandwidth because of its robust performance, simultaneous network access capability, and phase insensitivity. Correspondingly, there have been many efforts to take full advantage of fiber-optic communication techniques [8] [9] [10]. As shown in Fig. 2, an OCDMA network consists of transmitters, receivers, and a shared medium (the fiber). Compared to traditional time division multiple access (TDMA) and wavelength division multiple access (WDMA), the most significant advantage of OCDMA is that it allows all transmitters to share the network and send data simultaneously at full bandwidth. The receiver, on the other end, must extract the useful data from the mixed signal to complete the transmission.

OCDMA can simplify network management and control, such as addressing and routing. Consequently, OCDMA is an attractive candidate for NoC communications. OCDMA communication systems have lower latencies than existing token-based systems (e.g., Corona). Requiring neither time nor wavelength management, OCDMA can enable an asynchronous NoC without centralized arbitration and control. Such an OCN is also free of packet collisions, leading to more flexible network designs. In addition, OCDMA can potentially achieve a significant performance gain, since there is no need to allocate time or wavelength slots to individual communication nodes.

To guarantee the quality of the information extracted from the mixed signal, OCDMA codes are designed to maximize auto-correlation and minimize cross-correlation. Examples include modified frequency hopping (MFH) codes [11], prime codes, and optical orthogonal codes (OOCs) [12]. In this work, we adopt OOC, a family C of (0, 1) sequences of length n and weight w that satisfy the auto-correlation condition (Equation 1, for any x ∈ C and any integer τ with 0 < τ < n) and the cross-correlation condition (Equation 2, for any distinct x, y ∈ C and any integer τ). Ideally, orthogonal codes have cross-correlation equal to zero so that they do not interfere with each other. For simplicity, the notation (n, w, λa, λc) represents an OOC; some sample codes are shown in Table I.

$$\sum_{t=0}^{n-1} x_t \, x_{t+\tau} \le \lambda_a \tag{1}$$

$$\sum_{t=0}^{n-1} x_t \, y_{t+\tau} \le \lambda_c \tag{2}$$

TABLE I
EXAMPLES OF OPTIMAL OPTICAL ORTHOGONAL CODES [12].

n     Optimal OOC (n, 3, 1, 1)
7     {0,1,3}
13    {0,1,4}, {0,2,7}
19    {0,1,5}, {0,2,8}, {0,3,10}
25    {0,1,6}, {0,2,9}, {0,3,11}, {0,4,13}
31    {0,1,7}, {0,2,11}, {0,3,15}, {0,4,14}, {0,5,13}
37    {0,1,11}, {0,2,9}, {0,3,17}, {0,4,12}, {0,5,18}, {0,6,12}
43    {0,1,19}, {0,2,22}, {0,3,15}, {0,4,13}, {0,5,16}, {0,6,14}, {0,7,17}
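The correlation conditions in Equations 1 and 2 can be checked mechanically for any candidate code set. The following is a minimal sketch, assuming that the index t+τ wraps around modulo n (periodic correlation) and that each code is given as its set of mark positions, as in Table I. The helper names `to_sequence`, `correlation`, and `is_ooc` are illustrative and do not come from the paper.

```python
from itertools import combinations


def to_sequence(positions, n):
    """Expand a set of mark positions into a (0, 1) sequence of length n."""
    marks = set(positions)
    return [1 if i in marks else 0 for i in range(n)]


def correlation(x, y, tau):
    """Periodic correlation: sum over t of x[t] * y[(t + tau) mod n]."""
    n = len(x)
    return sum(x[t] * y[(t + tau) % n] for t in range(n))


def is_ooc(codes, n, lam_a, lam_c):
    """Check the auto- and cross-correlation conditions for a code set."""
    seqs = [to_sequence(c, n) for c in codes]
    # Equation 1: every nonzero shift of a code against itself stays within lam_a.
    for x in seqs:
        if any(correlation(x, x, tau) > lam_a for tau in range(1, n)):
            return False
    # Equation 2: every shift of every distinct pair stays within lam_c.
    for x, y in combinations(seqs, 2):
        if any(correlation(x, y, tau) > lam_c for tau in range(n)):
            return False
    return True


# The n = 13 row of Table I.
codes_13 = [{0, 1, 4}, {0, 2, 7}]
print(is_ooc(codes_13, 13, 1, 1))
```

For the n = 13 row the check passes, while a set such as {0,1,2} with n = 7 fails the auto-correlation condition because two marks coincide at shift 1.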

III. O2CDMA ARCHITECTURE

To apply OCDMA to OCN, we propose the O2CDMA architecture shown in Fig. 3. First of all, O2CDMA takes advantage of 3D technology: the O2CDMA layer is stacked on the electronic cores. As shown in Fig. 4, O2CDMA is snake-shaped to connect 64 clusters by a single waveguide, which is similar to Corona. Each cluster is composed of four cores with a traditional electronic network for local data transmission. One optical stop is deployed as the gateway between the electronic and optical worlds. The E/O (O/E) converter converts an electronic (optical) signal to an optical (electronic) signal. The OCDMA encoder and decoder are the new components added in O2CDMA. In the encoding stage, a logic '1' is translated to the (0, 1) sequence of the OOC, while in the decoding stage the data bit is restored from the auto-correlation result. Moreover, a modulator array and a photodetector array are used for light transmission.

Fig. 3. 3D View of O2CDMA.

Fig. 4. O2CDMA Structure. [Diagram: a laser source feeds waveguides that snake through an 8 × 8 grid of clusters (rows and columns 0 to 7); each cluster, e.g. Cluster (0, 0), contains four cores, caches and interconnects, and an optical stop with a TX path (encoder, E/O) and an RX path (decoder, O/E).]
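The encoding and decoding stages described above can be illustrated with a small simulation. This is only a sketch under several assumptions that the paper does not spell out: a logic '0' is sent as an all-zero frame (on-off keying), all transmitters are frame-synchronous, the shared waveguide adds chip intensities ideally, and the decoder declares a '1' when the correlation with its code reaches the code weight w. The function names are hypothetical.

```python
def encode(bits, code, n):
    """Map each data bit to one n-chip frame: the OOC for '1', all zeros for '0'."""
    chips = []
    for bit in bits:
        frame = [0] * n
        if bit:
            for pos in code:
                frame[pos] = 1
        chips.extend(frame)
    return chips


def superimpose(*streams):
    """Model the shared waveguide as a chip-wise sum of all transmitted streams."""
    return [sum(chips) for chips in zip(*streams)]


def decode(chips, code, n):
    """Correlate each frame with the code and threshold at the code weight."""
    weight = len(code)
    bits = []
    for start in range(0, len(chips), n):
        frame = chips[start:start + n]
        score = sum(frame[pos] for pos in code)
        bits.append(1 if score >= weight else 0)
    return bits


# Two clusters using the n = 13 codes from Table I.
n = 13
code_a, code_b = (0, 1, 4), (0, 2, 7)
bits_a, bits_b = [1, 0, 1, 1], [0, 1, 1, 0]

channel = superimpose(encode(bits_a, code_a, n), encode(bits_b, code_b, n))
print(decode(channel, code_a, n), decode(channel, code_b, n))
```

With cross-correlation bounded by 1, the other transmitter can add at most 1 to the correlation score, which stays below the threshold of 3, so both bit streams are recovered from the mixed signal in this two-user case.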

For better understanding, we employ Corona as the baseline for comparison and list the differences as follows:

• CDMA vs. TDMA. Corona is effectively built on TDMA, since a dedicated arbitration waveguide acting as the token ring allows only one cluster to send data to a home cluster at any time. In contrast, O2CDMA allocates a unique OOC to each cluster to enable OCDMA.

• Shared data waveguide vs. separate data waveguides. All clusters share four waveguides (or 256 wavelengths) for data transfer in O2CDMA. In contrast, Corona needs 64 separate waveguide bundles because of the crossbar implementation.

• Single modulator array vs. multiple modulator arrays. Instead of the 63 modulator arrays implemented in Corona (64 rings/array), a single modulator array meets the requirement in O2CDMA. Again, this benefits from wavelength sharing, so no separate multiple write single read (MWSR) buses are needed.

TABLE II
SIMULATION PARAMETERS

Clock Rate          1 GHz
Modulation Rate     5 Gb/s
Packet Size         32 Bytes
Computing Time      1 us
Propagation Time    8 clocks

TABLE III
OPTICAL RESOURCE COMPARISON

Photonic Subsystem    Waveguides           Ring Resonators
                      Corona    OCDMA      Corona    OCDMA
crossbar              256       4          1024K     16K
arbitration           2         0          8K        0
clock                 1         1          64        64

IV. EVALUATION RESULTS

A. Experiment Setup

In this preliminary experiment, we built our own in-house simulator with OMNeT++. The simulation aims at comparing the performance and power consumption of the Corona system and the OCDMA-based system. The photonic subsystems, including crossbar, arbitration, and clock, are simulated. Table II summarizes the parameter configuration in our simulation. We simplify the communication into two modes: memory read requests and memory write requests. Each cluster can generate memory write requests consecutively, but a read request can be generated only after the previous request has been fulfilled. Write and read requests are generated randomly with equal probability. The propagation time is at most 8 clock cycles, the same as in Corona [1]. The fairness arbitration scheme described in [1] is also implemented.

B. Experiment Result

Table III compares the numbers of waveguides and ring resonators in the Corona architecture and in our proposed OCDMA architecture. Clearly, once OCDMA is used in the NoC, the number of waveguides is significantly reduced because multiple clusters can share one common communication channel. A smaller number of waveguides directly leads to a reduction in the number of microrings. In the Corona system, each bundle of 4 waveguides is used as a single-read multiple-write channel, so every cluster owns its read channel. In OCDMA, all clusters communicate through a shared channel, so the waveguide bundle is a multiple-read multiple-write data channel. Arbitration is not required in OCDMA because the encoded data does not interfere with other transmissions.
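To make the scale of the reduction concrete, the entries of Table III can be totalled per design. The sketch below assumes that "K" in the table denotes 1024; the dictionary layout and the `totals` helper are illustrative only.

```python
K = 1024  # assumption: "K" in Table III stands for 1024

# (waveguides, ring resonators) per photonic subsystem, copied from Table III.
corona = {"crossbar": (256, 1024 * K), "arbitration": (2, 8 * K), "clock": (1, 64)}
ocdma = {"crossbar": (4, 16 * K), "arbitration": (0, 0), "clock": (1, 64)}


def totals(design):
    """Sum waveguides and ring resonators over all subsystems of a design."""
    waveguides = sum(w for w, _ in design.values())
    rings = sum(r for _, r in design.values())
    return waveguides, rings


corona_wg, corona_rings = totals(corona)
ocdma_wg, ocdma_rings = totals(ocdma)

print(f"waveguides: Corona {corona_wg}, OCDMA {ocdma_wg}")
print(f"rings:      Corona {corona_rings}, OCDMA {ocdma_rings}")
print(f"ring reduction factor: {corona_rings / ocdma_rings:.1f}x")
```

Under that reading of "K", the crossbar dominates both totals, so sharing the data waveguides accounts for nearly all of the savings.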

Fig. 5. Comparison of the Read and Write Performance. [Plot: average delay (ns), 0 to 50, versus number of clusters (4, 8, 16, 32, 64, 128, 256); series: corona-read, ocdma-read, corona-write, ocdma-write.]

Figure 5 compares the read and write performance of the baseline Corona network and the OCDMA-based network. We vary the number of clusters from 4 to 256 and examine the read and write delays. First of all, the figure shows that the OCDMA-based network has lower average delays than the baseline Corona configuration over the whole range, and especially when the system has up to 32 clusters. The average delays of both the baseline and the OCDMA-based network increase much faster when there is a large number of clusters. We assume that a read request is finished only when the requested data is received by the sender, so the read delay is a round-trip delay and is larger than the write delay. For the Corona network, when the receiver gets the read request, it needs to wait for arbitration before sending out its data. Besides the network traffic delay, the extra delay introduced by round-robin-style arbitration is also included. When 64 clusters are simulated, on-chip OCDMA obtains about a 27.4% performance improvement.

V. CONCLUSION

In this paper, we move OCDMA from LAN to OCN and propose O2CDMA to improve power efficiency. Compared to Corona, O2CDMA dramatically reduces the number of microrings and waveguides. Performance can be improved by removing the delay caused by complex token arbitration. The preliminary results show that a 27.4% performance improvement can be achieved in a 64-cluster system. However, the encoded data might become too lengthy in the OCDMA scheme as the number of clusters increases, which would impact communication performance. Encoding and decoding power consumption can also be a problem. An optimized scheme is proposed to solve this problem. Instead of supporting all clusters communicating simultaneously, only a limited number of clusters can use the channel at the same time. To support this optimized scheme, a hand-shaking protocol is used before data transmission. The hand-shaking protocol can be implemented on the electronic network with ultra-short messages. The OCDMA-based network is a broadcast network, so the pre-communication protocol can also notify the intended receiver to avoid unnecessary power loss.
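The concern about code length can be quantified with the standard upper bound on the size of an (n, w, 1, 1) OOC, which allows at most floor((n − 1) / (w(w − 1))) codewords; the rows of Table I meet this bound for w = 3. The sketch below applies the bound to estimate the shortest possible code length for a given number of simultaneously active clusters. It gives only a lower bound on n, since an optimal code is not guaranteed to exist for every length, and the function names are illustrative.

```python
def max_codes(n, w=3):
    """Upper bound on the number of codewords in an (n, w, 1, 1) OOC."""
    return (n - 1) // (w * (w - 1))


def min_code_length(num_codes, w=3):
    """Smallest n for which the bound permits num_codes codewords."""
    return num_codes * w * (w - 1) + 1


# The bound reproduces the codeword counts of Table I (1, 2, ..., 7 codes).
for n in (7, 13, 19, 25, 31, 37, 43):
    print(n, max_codes(n))

# Code length needed if every one of 64 clusters holds its own weight-3 code,
# versus a limited group of 8 simultaneously active clusters.
print(min_code_length(64), min_code_length(8))
```

With weight 3, giving each of 64 clusters its own code requires a length of at least 385 chips per data bit, whereas limiting the channel to 8 concurrent clusters needs only 49, which is consistent with the motivation for restricting the number of simultaneous users.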

REFERENCES

[1] D. Vantrease, R. Schreiber, M. Monchiero, M. McLaren, N. P. Jouppi, M. Fiorentino, A. Davis, N. Binkert, R. G. Beausoleil, and J. H. Ahn, “Corona: System Implications of Emerging Nanophotonic Technology,” in ISCA’08, June 2008, pp. 153–164.

[2] M. J. Cianchetti, J. C. Kerekes, and D. H. Albonesi, “Phastlane: A Rapid Transit Optical Routing Network,” in ISCA’09, June 2009, pp. 441–450.

[3] A. Biberman, H. Lira, K. Padmaraju, N. Ophir, J. Chan, M. Lipson, and K. Bergman, “Broadband Silicon Photonic Electrooptic Switch for Photonic Interconnection Networks,” Photonics Technology Letters, IEEE, vol. 23, no. 8, pp. 504–506, April 2011.

[4] N. Binkert, A. Davis, N. P. Jouppi, M. McLaren, N. Muralimanohar, R. Schreiber, and J. H. Ahn, “The Role of Optics in Future High Radix Switch Design,” in ISCA’11, June 2011, pp. 437–447.

[5] A. Biberman, J. Chan, and K. Bergman, “On-chip Optical Interconnection Network Performance Evaluation Using Power Penalty Metrics from Silicon Photonic Modulators,” in IITC’10, June 2010, pp. 1–3.

[6] A. N. Udipi, N. Muralimanohar, R. Balasubramonian, A. Davis, and N. P. Jouppi, “Combining Memory and a Controller with Photonics Through 3D-Stacking to Enable Scalable and Energy-Efficient Systems,” in ISCA’11, San Jose, CA, USA, June 2011, pp. 425–436.

[7] N. C., F. M., and A. V., “Addressing System-Level Trimming Issues in On-chip Nanophotonic Networks,” in HPCA’11, Feb 2011, pp. 122–131.

[8] P. Prucnal, M. Santoro, and T. Fan, “Spread Spectrum Fiber-optic Local Area Network Using Optical Processing,” IEEE Transactions on Lightwave Technology, vol. 4, no. 5, pp. 547–554, 1986.

[9] S. Mashhadi and J. A. Salehi, “Code-division Multiple-access Techniques In Optical Fiber Networks - Part III: Optical And Logic Gate Receiver Structure With Generalized Optical Orthogonal Codes,” IEEE Transactions on Communications, vol. 54, pp. 1457–1468, 2006.

[10] B. M. Ghaffari and J. A. Salehi, “Multiclass, multistage, and multilevel fiber-optic CDMA signaling techniques based on advanced binary optical logic gate elements,” IEEE Transactions on Communications, vol. 57, pp. 1424–1432, 2009.

[11] E. Smith, R. Blaikie, and D. Taylor, “Performance enhancement of spectral-amplitude-coding optical CDMA using pulse-position modulation,” IEEE Transactions on Communications, vol. 46, pp. 1176–1185, 1998.

[12] F. Chung, J. Salehi, and V. Wei, “Optical orthogonal codes: design, analysis and applications,” IEEE Transactions on Communications, vol. 35, pp. 595–604, 1989.