micro computer architecture

The magazine for chip and silicon systems designers

The Academic and Business Marriage, p. 152

http://www.computer.org/micro

May/June 2014


IEEE Micro (ISSN 0272-1732) is published bimonthly by the IEEE Computer Society. IEEE Headquarters, Three Park Ave., 17th Floor, New York, NY 10016-5997; IEEE Computer Society Headquarters, 2001 L St., Ste. 700, Washington, DC 20036; IEEE Computer Society Publications Office, 10662 Los Vaqueros Circle, PO Box 3014, Los Alamitos, CA 90720. Annual subscription rates: IEEE Computer Society members get the lowest rates, US$45 (print and electronic). Go to http://www.computer.org/subscribe to order and for more information on other subscription prices. Back issues: members, $20; nonmembers, $148. This magazine is also available on the Web.

Postmaster: Send address changes and undelivered copies to IEEE, Membership Processing Dept., 445 Hoes Ln., Piscataway, NJ 08855. Periodicals postage is paid at New York, NY, and at additional mailing offices. Canadian GST #125634188. Canada Post Corp. (Canadian distribution) Publications Mail Agreement #40013885. Return undeliverable Canadian addresses to 4960-2 Walker Road; Windsor, ON N9A 6J3. Printed in USA.

Reuse rights and reprint permissions: Educational or personal use of this material is permitted without fee, provided such use: 1) is not made for profit; 2) includes this notice and a full citation to the original work on the first page of the copy; and 3) does not imply IEEE endorsement of any third-party products or services. Authors and their companies are permitted to post the accepted version of IEEE-copyrighted material on their own web servers without permission, provided that the IEEE copyright notice and a full citation to the original work appear on the first screen of the posted copy. An accepted manuscript is a version which has been revised by the author to incorporate review suggestions, but not the published version with copy-editing, proofreading, and formatting added by IEEE. For more information, please go to http://www.ieee.org/publications_standards/publications/rights/paperversionpolicy.html.

Permission to reprint/republish this material for commercial, advertising, or promotional purposes or for creating new collective works for resale or redistribution must be obtained from IEEE by writing to the IEEE Intellectual Property Rights Office, 445 Hoes Lane, Piscataway, NJ 08854-4141 or [email protected]. © 2014 IEEE. All rights reserved.

Abstracting and library use: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy for private use of patrons, provided the per-copy fee indicated in the code at the bottom of the first page is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.

Editorial: Unless otherwise stated, bylined articles, as well as product and service descriptions, reflect the author's or firm's opinion. Inclusion in IEEE Micro does not necessarily constitute an endorsement by IEEE or the Computer Society. All submissions are subject to editing for style, clarity, and space. IEEE prohibits discrimination, harassment, and bullying. For more information, visit http://www.ieee.org/web/aboutus/whatis/policies/p9-26.html.

May/June 2014 Volume 34 Number 3

Features

4 Guest Editors' Introduction: Top Picks from the 2013 Computer Architecture Conferences
Mithuna S. Thottethodi and Shubu Mukherjee

8 Designing and Managing Datacenters Powered by Renewable Energy
Iñigo Goiri, William Katsak, Kien Le, Thu D. Nguyen, and Ricardo Bianchini

17 Quality-of-Service-Aware Scheduling in Heterogeneous Datacenters with Paragon
Christina Delimitrou and Christos Kozyrakis

31 A Case for Specialized Processors for Scale-Out Workloads
Michael Ferdman, Almutaz Adileh, Onur Kocberber, Stavros Volos, Mohammad Alisafaee, Djordje Jevdjic, Cansu Kaynak, Adrian Daniel Popescu, Anastasia Ailamaki, and Babak Falsafi

43 Smart: Single-Cycle Multihop Traversals over a Shared Network on Chip
Tushar Krishna, Chia-Hsin Owen Chen, Woo-Cheol Kwon, and Li-Shiuan Peh

57 Networks on Chip with Provable Security Properties
Hassan M.G. Wassel, Ying Gao, Jason K. Oberg, Ted Huffmire, Ryan Kastner, Frederic T. Chong, and Timothy Sherwood

69 Cache Coherence for GPU Architectures
Inderpreet Singh, Arrvindh Shriraman, Wilson W.L. Fung, Mike O'Connor, and Tor M. Aamodt

80 A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches
Jaewoong Sim, Gabriel H. Loh, Vilas Sridharan, and Mike O'Connor

91 Decoupled Compressed Cache: Exploiting Spatial Locality for Energy Optimization
Somayeh Sardashti and David A. Wood

100 Sonic Millip3De: An Architecture for Handheld 3D Ultrasound
Richard Sampson, Ming Yang, Siyuan Wei, Chaitali Chakrabarti, and Thomas F. Wenisch

109 Hardware Partitioning for Big Data Analytics
Lisa Wu, Raymond J. Barker, Martha A. Kim, and Kenneth A. Ross

120 Efficient Spatial Processing Element Control via Triggered Instructions
Angshuman Parashar, Michael Pellauer, Michael Adler, Bushra Ahsan, Neal Crago, Daniel Lustig, Vladimir Pavlov, Antonia Zhai, Mohit Gambhir, Aamer Jaleel, Randy Allmon, Rachid Rayess, Stephen Maresh, and Joel Emer

138 DeNovoND: Efficient Hardware for Disciplined Nondeterminism
Hyojin Sung, Rakesh Komuravelli, and Sarita V. Adve

Departments

2 From the Editor in Chief
Top Picks from 2013

149 Awards
Reflections from the 2013 Eckert-Mauchly Award Recipient

152 Micro Economics
The Academic and Business Marriage

Cover artwork by Giacomo Marchesi, www.GiacomoMarchesi.com


EDITOR IN CHIEF

Erik R. Altman
Thomas J. Watson Research Center
[email protected]

ASSOCIATE EDITOR IN CHIEF

Lieven Eeckhout
Ghent University
[email protected]

ADVISORY BOARD

David H. Albonesi, Pradip Bose, Kemal Ebcioglu,Michael Flynn, Ruby B. Lee, Yale Patt, James E.Smith, and Marc Tremblay

EDITORIAL BOARD

Alper Buyuktosunoglu, IBM

Pradeep Dubey, Intel Corp.

Sandhya Dwarkadas, University of Rochester

Babak Falsafi, Ecole Polytechnique Federale de Lausanne

Krisztian Flautner, ARM

R. Govindarajan, Indian Institute of Science

Shane Greenstein, Northwestern University

Lizy Kurian John, University of Texas at Austin

Stephen W. Keckler, University of Texas at Austin

Margaret Martonosi, Princeton University

Richard Mateosian

Shubu Mukherjee, Cavium Networks

Toshio Nakatani, IBM

Vojin G. Oklobdzija, New Mexico State University

Ronny Ronen, Intel Corp.

Kevin W. Rudd, US Naval Academy

Andre Seznec, INRIA Rennes

Richard H. Stern

Olivier Temam, INRIA

Mateo Valero, Technical University of Catalonia

Tilman Wolf, University of Massachusetts, Amherst

Xiaodong Zhang, Ohio State University

EDITORIAL STAFF

Editorial Management

Molly Gamborg

Contributing Editors

Amber Ankerholz, Thomas Centrella,

Kristine Kelly, Keri Schreiner,

Dale Strok, and Joan Taylor

Director, Products & Services

Evan Butterfield

Senior Manager, Editorial Services

Robin Baldwin

Associate Manager, Peer Review & Periodical Administration

Hilda Carman

Senior Business Development Manager

Sandra Brown

Senior Advertising Coordinator

Marian Anderson

EDITORIAL OFFICE

PO Box 3014, Los Alamitos, CA 90720;

(714) 821-8380; [email protected]

Submissions:

https://mc.manuscriptcentral.com/micro-cs

Author guidelines:

http://www.computer.org/micro

IEEE COMPUTER SOCIETY

PUBLICATIONS BOARD

Vice President

Jean-Luc Gaudiot

Magazine Operations Chair

Paolo Montuschi

Transactions Operations Committee

Laxmi N. Bhuyan

Digital Library Operations Committee

Frank Ferrante

Plagiarism Chair

David S. Ebert

Executive Director

Angela R. Burgess

Members-at-Large

Alain April, Greg Byrd, Robert Dupuis,

Linda I. Shafer, H.J. Siegel, and Per Stenstrom

COMPUTER SOCIETY MAGAZINE

OPERATIONS COMMITTEE

Paolo Montuschi (Chair)

Erik R. Altman, Maria Ebling, Miguel Encarnacao,

Lars Heide, Cecilia Metra, San Murugesan, Shari

Lawrence Pfleeger, Michael Rabinovich, Yong Rui,

Forrest Shull, George K. Thiruvathukal, Ron Vetter,

and Daniel Zeng


Top Picks from 2013

ERIK R. ALTMAN
Thomas J. Watson Research Center

This double issue features our annual Top Picks from the microarchitecture conferences held in 2013. I thank Guest Editors Mithuna S. Thottethodi and Shubu Mukherjee for their outstanding job in all aspects of running the Program Committee and arriving at these selections. I am also happy to report that we received a record 101 submissions, from which 12 were selected for publication here.

Like last year, it seemed an interesting exercise to compare the topics of 2013 Top Picks articles with topics covered in the inaugural 2003 Top Picks issue. In 2003, Guest Editors Charles Moore, Kevin W. Rudd, Ruby B. Lee, and Pradip Bose divided articles into six categories. I have assigned this year's articles to those same six categories, as shown in Table 1. In doing so, only one article did not seem a good fit to any of the 2003 categories. That article focuses on datacenters, and in 2003 there was no datacenter or cloud computing category. (Other articles this year also touch on datacenters, but have aspects that fit within 2003 categories.)

The inability to continue Dennard scaling has yielded a major increase in articles in the "Unconventional architectures" category, whereas "Building on conventional microarchitectures" dropped to zero articles, as did "Performance analysis," with other categories staying roughly similar.

Readers are sometimes confused about how the Top Picks articles published here differ from the original conference publications. Like all IEEE publications, IEEE Micro requires at least 30 percent new content over any previous publication. Top Picks articles generally meet this requirement via a three-page summary (in the initial submission), summarizing the paper and arguing for the potential of the work to have long-term impact. (Indeed, for the upcoming Top Picks to be published in 2015, Program Committee Chairs and Guest Editors Luis Ceze and Karin Strauss ask "what the citation of your paper would be if it won the test of time award in 10 years.") In addition, IEEE Micro has a 5,000-word limit, so authors often have to condense their original paper. As a result, the IEEE Micro version of Top Picks papers generally provides more context and a slightly higher-level overview of the work, with the original conference paper serving as a deeper reference for readers interested in more detail. This approach inverts the historical practice of journals providing a more detailed record of conference papers, but we think that this Top Picks approach has served IEEE Micro well.

This Top Picks issue is also unique among IEEE Micro editions (and possibly among all IEEE Computer Society publications) in that the Manuscript Central/ScholarOne reviewing system is not used for initial submissions. Instead, the Program Chairs deploy their preferred reviewing system. Papers recommended by the Program Committee for acceptance are then entered into Manuscript Central for the final stages of processing. This separate reviewing system makes it easier to manage the large volume of submissions.

Why go into this detail about reviewing software? The IEEE Computer Society constantly works with Thomson Reuters, the owner of ScholarOne, to improve its capabilities. As part of that effort, ScholarOne maintains two websites to suggest ideas for its reviewing system and to vote on suggestions of others:

Offer Suggestions: http://scholaroneideas.force.com/ideaListCustom

Rate Ideas of Others: http://mchelp.manuscriptcentral.com/ScholarOneIdeas/howto.html

I encourage any of you who author articles for IEEE Micro, or who serve as reviewers, to visit these sites and help improve ScholarOne.

Finally, this issue continues our recent practice, led by Associate Editor in Chief Lieven Eeckhout, of noting major awards. More specifically, this issue includes a column by James Goodman about the work that led to his Eckert-Mauchly Award. Jim has many interesting and broad-ranging observations about his life and career, and I hope you will enjoy it as much as I did.

With that, as with the Top Picks articles, happy reading!

Erik R. Altman
Editor in Chief
IEEE Micro

Published by the IEEE Computer Society. 0272-1732/14/$31.00 © 2014 IEEE

Erik R. Altman is the manager of the Dynamic Optimization Group at the Thomas J. Watson Research Center. Contact him at [email protected].

Table 1. Mapping 2013 Top Picks articles to 2003 Top Picks categories.

Category | No. of 2003 articles in category | No. of 2013 articles in category | Articles in this issue
Unconventional architectures | 3 | 7 | A Case for Specialized Processors for Scale-Out Workloads; Smart: Single-Cycle Multihop Traversals over a Shared Network on Chip; Efficient Spatial Processing Element Control via Triggered Instructions; DeNovoND: Efficient Hardware for Disciplined Nondeterminism; Networks on Chip with Provable Security Properties; Sonic Millip3De: An Architecture for Handheld 3D Ultrasound; Hardware Partitioning for Big Data Analytics
Power- and temperature-aware design | 2 | 2 | Designing and Managing Datacenters Powered by Renewable Energy; Decoupled Compressed Cache: Exploiting Spatial Locality for Energy Optimization*
Reliability | 2 | 1 | A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches*
Cache, memory, and multiprocessor optimizations | 4 | 3 | Cache Coherence for GPU Architectures; Decoupled Compressed Cache: Exploiting Spatial Locality for Energy Optimization*; A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches*
Building on conventional microarchitectures | 2 | 0 | N/A
Performance analysis | 2 | 0 | N/A
None of the above | 0 | 1 | Quality-of-Service-Aware Scheduling in Heterogeneous Datacenters with Paragon

*These articles fit in two categories from 2003.


Guest Editors' Introduction

TOP PICKS FROM THE 2013 COMPUTER ARCHITECTURE CONFERENCES

Mithuna S. Thottethodi, Purdue University
Shubu Mukherjee, Cavium

It gives us great pleasure to introduce the special issue of the top picks from the computer architecture conferences of 2013. The special issue presents a selection of 12 papers that describe novel, exciting research directions in areas as diverse as design of datacenters, processors and accelerators, networks on chip, programmability-enhancing frameworks, and emerging large caches.

The review process

We received a total of 101 submissions. The full program committee of 30 members (see the sidebar "The Selection Committee") reviewed all submissions. Each paper received at least four reviews (with many receiving five reviews) from the program committee. In cases where one Selection Committee chair had a conflict of interest with a submission, the other chair handled the review assignments. There were no papers on which both Selection Committee chairs had conflicts. In addition to the Selection Committee reviews, four external reviews were also sought for unique cases where we felt specific outside expertise was needed. Papers with high variance in scores were also targeted for additional online discussion and, in some cases, additional reviews. We thank the committee and the external reviewers for their time and effort toward this valuable service to the computer architecture community.

Note that, in addition to papers published in 2013, selected papers published in 2012 were also eligible for inclusion in this year's issue of Top Picks because of the conflict handling rules of Top Picks. Under these rules, the selection committee chairs may not submit their own papers in the year they serve as chair. However, their papers are eligible for full consideration in the following year.

We selected 41 top-ranked papers (based on the average overall merit score for each paper) for discussion at the PC meeting. Furthermore, to minimize the impact of variations in reviewer generosity, we verified that the 41 papers included the top-ranked papers of most individual committee members. We encouraged the committee to champion other papers for discussion that may have been among the top papers in their assigned reviews if such papers had not automatically qualified for discussion based on the overall score. Consequently, one additional paper was added to the discussion list, taking the total to 42.

The Selection Committee discussed all 42 papers at a meeting in Boston on 10 January (with 28 members attending physically and two participating via teleconference). Committee members with conflicts left the room before papers were discussed. The meeting was conducted in two phases. In the first phase, the committee voted to accept or reject papers without regard to the total number of papers with the explicit understanding that we may overshoot the target. In the second phase, the committee revisited the specific shortlisted papers to arrive at the final list of 12 papers (see the "Top Picks of 2013" sidebar). We congratulate the authors on this well-deserved accolade.

The selected papers

The selected papers are responsive to many of the pressing problems that we face today. The emergence of cloud computing fueled by social media networks is leading to innovations in datacenters. The continuous need to improve the energy efficiency of these clouds of processors, memory, and disks has led to high performance-per-watt mechanisms, such as accelerator engines, better scheduling of datacenter resources, and new styles of processor, cache, memory, and network design more suited for datacenters and future workloads. Security continues to be an overriding concern in this world of public clouds and mobile computing, which has led to innovation in the security architecture of today's processors. As co-chairs of this IEEE Micro Top Picks issue, we are excited to present to our audience a glimpse of how architects envision solving today's challenging computing problems.

Maximizing the use of renewable energy to power these large datacenters is important from a sustainability perspective. "Designing and Managing Datacenters Powered by Renewable Energy" by Iñigo Goiri et al. responds to this challenge by developing strategies to optimally use renewable energy from sources that fall under the commonly used colocation/self-generation model.

In addition to energy efficiency, it is important to efficiently schedule available hardware resources to maximize performance in datacenters, especially in challenging environments where hardware is typically heterogeneous (due to rolling upgrades), and application performance is interference prone. In "Quality-of-Service-Aware Scheduling in Heterogeneous Datacenters with Paragon," Christina Delimitrou and Christos Kozyrakis develop a novel scalable scheduling technique that is heterogeneity and interference aware to significantly boost performance (compared to an oblivious scheduling approach).

Although the computing landscape has changed dramatically from a desktop-and-local-software regime to cloud-based computing, processor designs have more or less remained the same. "A Case for Specialized Processors for Scale-Out Workloads" by Michael Ferdman et al. argues that there is a mismatch between modern processor hardware and the requirements of emerging cloud workloads. This work suggests directions in processor design for emerging cloud workloads. (The conference version of this paper was published in 2012, but it was eligible for Top Picks this year, per the conflict handling rules we described earlier.)

The Selection Committee

Tor Aamodt, University of British Columbia
David Albonesi, Cornell University
David August, Princeton University
Rajeev Balasubramonian, University of Utah
Pradip Bose, IBM
Doug Burger, Microsoft
John Carter, IBM
Joel Emer, Intel and Massachusetts Institute of Technology
Babak Falsafi, Ecole Polytechnique Federale de Lausanne
Antonio Gonzalez, Intel
Sudhanva Gurumurthi, University of Virginia and Advanced Micro Devices
Dan Jimenez, Texas A&M University
David Kaeli, Northeastern University
Alvin Lebeck, Duke University
Hsien-Hsin Lee, Georgia Institute of Technology
Gabriel Loh, Advanced Micro Devices
Margaret Martonosi, Princeton University
Kathryn McKinley, Microsoft and University of Texas at Austin
Milo Martin, University of Pennsylvania
Trevor Mudge, University of Michigan
Satish Narayanaswamy, University of Michigan
Eric Rotenberg, North Carolina State University
Karu Sankaralingam, University of Wisconsin-Madison
Yanos Sazeides, University of Cyprus
Simha Sethumadhavan, Columbia University
Andre Seznec, INRIA
Dan Sorin, Duke University
Dean Tullsen, University of California, San Diego
T.N. Vijaykumar, Purdue University
Sudhakar Yalamanchili, Georgia Institute of Technology


Given that most of the cloud servers are multicore servers and given the increasing importance of the network-on-chip (NoC) fabric in such servers (the NoC latency is on every L1 cache miss path), the performance of the NoC becomes critical. In "Smart: Single-Cycle Multihop Traversals over a Shared Network on Chip," Tushar Krishna et al. design an NoC that opportunistically bypasses multiple routers in a single cycle in the absence of contention. Under ideal conditions, the router effectively mimics the latency of a fully connected network even though the packets traverse several hops.

To ensure privacy and to prevent information leakage through timing channels, it is important to provably ensure complete timing isolation. "Networks on Chip with Provable Security Properties" by Hassan M.G. Wassel et al. solves this problem for NoCs. Unlike prior QoS approaches (where a guaranteed minimum performance is adequate), the provable timing isolation shown in this article achieves stronger isolation to ensure that there are no timing interactions among different domains.

As GPUs move toward providing more sophisticated memory models, the lack of viable coherence implementations remains a stumbling block. "Cache Coherence for GPU Architectures" by Inderpreet Singh et al. argues that revisiting the idea of temporal coherence might hold the key to efficient cache coherence implementations for GPU architectures.

Die-stacked DRAM, which is on the cusp of widespread adoption, has received significant attention regarding its role in the memory hierarchy. However, little attention has been paid to its RAS characteristics. Jaewoong Sim et al., in their article "A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches," show that rather than carrying over RAS solutions from traditional DRAM, novel RAS solutions that are customized for die-stacked DRAM are preferable.

Last-level caches are a precious resource and, as such, there is strong motivation to use compression to squeeze out more effective capacity. The article "Decoupled Compressed Cache: Exploiting Spatial Locality for Energy Optimization" by Somayeh Sardashti and David A. Wood overcomes key limitations of prior compression techniques in terms of fragmentation and tag limits by leveraging decoupled organization.

Top Picks of 2013

"Designing and Managing Datacenters Powered by Renewable Energy" by Iñigo Goiri, William Katsak, Kien Le, Thu D. Nguyen, and Ricardo Bianchini

"Quality-of-Service-Aware Scheduling in Heterogeneous Datacenters with Paragon" by Christina Delimitrou and Christos Kozyrakis

"A Case for Specialized Processors for Scale-Out Workloads" by Michael Ferdman, Almutaz Adileh, Onur Kocberber, Stavros Volos, Mohammad Alisafaee, Djordje Jevdjic, Cansu Kaynak, Adrian Daniel Popescu, Anastasia Ailamaki, and Babak Falsafi

"Smart: Single-Cycle Multihop Traversals over a Shared Network on Chip" by Tushar Krishna, Chia-Hsin Owen Chen, Woo-Cheol Kwon, and Li-Shiuan Peh

"Networks on Chip with Provable Security Properties" by Hassan M.G. Wassel, Ying Gao, Jason K. Oberg, Ted Huffmire, Ryan Kastner, Frederic T. Chong, and Timothy Sherwood

"Cache Coherence for GPU Architectures" by Inderpreet Singh, Arrvindh Shriraman, Wilson W.L. Fung, Mike O'Connor, and Tor M. Aamodt

"A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches" by Jaewoong Sim, Gabriel H. Loh, Vilas Sridharan, and Mike O'Connor

"Decoupled Compressed Cache: Exploiting Spatial Locality for Energy Optimization" by Somayeh Sardashti and David A. Wood

"Sonic Millip3De: An Architecture for Handheld 3D Ultrasound" by Richard Sampson, Ming Yang, Siyuan Wei, Chaitali Chakrabarti, and Thomas F. Wenisch

"Hardware Partitioning for Big Data Analytics" by Lisa Wu, Raymond J. Barker, Martha A. Kim, and Kenneth A. Ross

"Efficient Spatial Processing Element Control via Triggered Instructions" by Angshuman Parashar, Michael Pellauer, Michael Adler, Bushra Ahsan, Neal Crago, Daniel Lustig, Vladimir Pavlov, Antonia Zhai, Mohit Gambhir, Aamer Jaleel, Randy Allmon, Rachid Rayess, Stephen Maresh, and Joel Emer

"DeNovoND: Efficient Hardware for Disciplined Nondeterminism" by Hyojin Sung, Rakesh Komuravelli, and Sarita V. Adve

In the context of domain-specific computing, Richard Sampson et al. develop a low-power, high-performance solution for 3D ultrasound in their article, "Sonic Millip3De: An Architecture for Handheld 3D Ultrasound." Beyond the immediate application of 3D ultrasound imaging, the article is a case study for accelerator design. The solution, which relies on hardware-algorithm codesign, develops a new accelerator architecture to bring the 3D beamforming problem within the desired performance/power envelope.

Continuing with the same theme of novel accelerators, "Hardware Partitioning for Big Data Analytics" by Lisa Wu et al. describes a low-area-overhead hardware accelerator that significantly improves data partitioning performance for the important class of database workloads.

In "Efficient Spatial Processing Element Control via Triggered Instructions," Angshuman Parashar et al. target spatial accelerators and develop a novel approach to control flow that eliminates the performance problems associated with program counter-based control flow used in prior spatial accelerators and architectures.

The article "DeNovoND: Efficient Hardware for Disciplined Nondeterminism" by Hyojin Sung et al. proposes a design that simplifies coherence implementation via disciplined coding while still allowing key nondeterminism features (which is critical for lock-based codes).

We hope that you enjoy reading these articles, as well as their original conference versions, and we welcome your feedback on this issue.

Acknowledgments

We thank Erik Altman for his support. We thank the web chairs Ahmed Abdel-Gawad, Timothy Pritchett, and Eric Villasenor, who helped ensure a stable and glitch-free experience with the conference software.

Mithuna S. Thottethodi is an associate professor in the School of Electrical and Computer Engineering at Purdue University. His research interests include parallel programming, parallel architecture, interconnection networks, storage, and multicore memory hierarchies. Thottethodi has a PhD in computer science from Duke University. He is a member of IEEE and the ACM.

Shubu Mukherjee is a distinguished engineer and the lead architect for the ARMv8 processor core at Cavium. His research interests include innovation confluencing and computer architecture. Mukherjee has a PhD in computer science from the University of Wisconsin-Madison. He is a Fellow of IEEE and the ACM.

Direct questions and comments about this issue to Mithuna S. Thottethodi at [email protected] or to Shubu Mukherjee at [email protected].


DESIGNING AND MANAGING DATACENTERS POWERED BY RENEWABLE ENERGY

ON-SITE RENEWABLE ENERGY HAS THE POTENTIAL TO REDUCE DATACENTERS' CARBON FOOTPRINT AND POWER AND ENERGY COSTS. THE AUTHORS BUILT PARASOL, A SOLAR-POWERED DATACENTER, AND GREENSWITCH, A SYSTEM FOR SCHEDULING WORKLOADS, TO EXPLORE THIS POTENTIAL IN A CONTROLLED RESEARCH SETTING.

......Datacenters range from a fewservers in a machine room to thousands ofservers housed in warehouse-size installa-tions.1 Estimates for 2010 indicate that, col-lectively, datacenters consume around 1.5percent of the total electricity used world-wide.1 This translates into high carbon emis-sions, as most of this electricity comes fromfossil fuels. A 2008 study estimated that data-centers emit 116 million metric tons of car-bon, slightly more than the entire country ofNigeria.2

With increasing societal demand forcleaner products and services, several compa-nies have announced plans to build greendatacentersthat is, datacenters partially orcompletely powered by renewables such assolar or wind energy. These datacenters willeither generate their own renewable energy(self-generation) or draw it directly from anexisting nearby plant (colocation). For exam-ple, Apple and McGraw-Hill have built largesolar arrays for their datacenters, whereasGreen House Data is a small cloud providerthat operates entirely on renewables. Al-though there are other approaches, theseexamples suggest that many datacenters that

seek to lower emissions will prefer colocationor self-generation. In our paper for the 18thInternational Conference on ArchitecturalSupport for Programming Languages andOperating Systems (ASPLOS 2013),3 we dis-cuss the current and expected future cost andspace needs of on-site solar and windgeneration.

Colocation and self-generation pose an interesting research challenge: solar and wind energy are intermittent, which requires approaches for tackling the energy supply variability. One approach is to use batteries and/or the electrical grid as a backup for the renewable energy. It might also be possible to adapt the workload (the energy demand) to match the renewable energy supply.4-8 For the highest benefits, green datacenter operators must intelligently manage their workloads and the energy sources at their disposal. For example, when the workload is deferrable (that is, it can be delayed within a time bound), it might be appropriate to delay some of the load and store the freed-up renewable energy in the batteries for later use (for example, to shave an expected load peak when the renewable energy is not available).

Íñigo Goiri

William Katsak

Kien Le

Thu D. Nguyen

Ricardo Bianchini

Rutgers University

As far as we know, green datacenter operators do not currently manage their energy sources and workloads in this manner.

We set out to build software and hardware to explore these issues. This article overviews two of our main efforts: Parasol and GreenSwitch.

Parasol

Figure 1a shows Parasol, a solar-powered datacenter that we built as a research platform to study colocation and self-generation. Parasol comprises a steel structure, a small custom container housing two racks of servers and networking equipment, an air-side economizer free-cooling unit and a direct-expansion air conditioner, 16 solar panels (producing up to 3.2 kW AC), two DC/AC inverters, 16 lead-acid batteries (storing up to 32 kWh), two charge controllers, and an electricity grid tie. Parasol currently houses 64 Atom-based servers (consuming at most 30 W each), but it is large enough to house 150 of them. It uses free cooling whenever outside temperatures and humidity are low enough, and air conditioning otherwise. Parasol can use solar energy directly, store it in its batteries, or feed it to the grid for credit (net metering). We thought about adding a wind turbine to Parasol, but historical weather data shows that our location (Piscataway, N.J.) is not windy enough.
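The figures quoted above allow a quick sizing check. The following is a minimal sketch, not a measurement: it uses only the numbers from the text, ignores networking, cooling, and conversion losses, and treats the usable battery fraction as an assumed parameter (the article does not state how deeply the batteries may be discharged).

```python
# Back-of-the-envelope sizing from the figures quoted in the text.
SERVERS_INSTALLED = 64
SERVERS_MAX = 150
SERVER_PEAK_W = 30.0   # "at most 30 W each"
SOLAR_PEAK_KW = 3.2    # peak AC output of the 16 panels
BATTERY_KWH = 32.0     # nominal storage of the 16 batteries


def peak_it_load_kw(servers: int, watts_each: float = SERVER_PEAK_W) -> float:
    """Upper bound on server power draw, ignoring networking and cooling."""
    return servers * watts_each / 1000.0


def battery_hours(load_kw: float, usable_fraction: float) -> float:
    """Hours the batteries could carry a constant load.

    usable_fraction is an assumption (lead-acid batteries are normally
    not discharged fully); conversion losses are ignored.
    """
    return BATTERY_KWH * usable_fraction / load_kw


current = peak_it_load_kw(SERVERS_INSTALLED)  # 1.92 kW
full = peak_it_load_kw(SERVERS_MAX)           # 4.5 kW
print(f"peak IT load now: {current:.2f} kW, fully populated: {full:.2f} kW")
print(f"solar peak covers current servers: {SOLAR_PEAK_KW >= current}")
print(f"hours on battery at 50% usable: {battery_hours(current, 0.5):.1f}")
```

Under these assumptions, the installed servers at peak draw stay below the solar peak output, whereas a fully populated container would not.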

Figure 1b shows Parasol's power distribution and monitoring infrastructure. Because Parasol was built as a research instrument for studying power management in green datacenters, it is critical that we understand the power usage of each component, as well as power losses. Thus, we have power meters (labeled "M" in the figure), either internal to components (for example, the DC/AC inverters) or added on externally (for example, the cooling-system meter), for measuring the power flowing into and out of every component. Parasol also includes a switch that allows for powering the cooling system from the main electrical panel or only from the grid. This enables experimentation with or without the cooling system loading the solar system and batteries.

Figure 1. Parasol: outside view showing the solar panels, container, and air conditioning unit (a); power distribution and monitoring infrastructure (b). The cooling system can be powered solely by the grid, or by the main electrical panel that receives power from all sources. Meters (M) are available for measuring the power flowing into and out of every component.

We describe our rationale for the Parasol design and the mistakes we made while building it over 16 months (at a total cost of $300,000) in our ASPLOS paper.3 In this article, we report on data gathered from operating Parasol over 22 months. Specifically, solar generation and the IT equipment became operational in April 2012, and Parasol became fully operational in June 2012.

Energy production and usage

Figure 2 shows energy usage, net-metered energy, and the average inside and outside temperatures from April 2012 to January 2014. We computed a power usage effectiveness (PUE) of 1.06 to 1.08, depending on the computing load, owing to losses from various conversions. April through June 2012 show little or no grid energy consumption, because the external meters did not become operational until the end of June 2012. Note that total solar energy production is the sum of solar energy consumed and solar energy net metered. This data shows that during the summer months Parasol produces more than 500 kWh every month, whereas during the winter this production is reduced to less than half. For the year spanning July 2012 through June 2013, we computed an average solar capacity factor of 16 percent. During this time, Parasol supported workloads used for studying GreenSwitch and six other research projects.
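For readers unfamiliar with the term, the capacity factor is the energy actually produced divided by the energy the array would produce if it ran at its rated output for the whole period. The sketch below illustrates the arithmetic; it assumes the 3.2 kW AC peak from the Parasol description is the rated power used in the calculation, which the article does not state explicitly, so the implied yearly energy is an estimate rather than a reported value.

```python
HOURS_PER_YEAR = 365 * 24  # July 2012 through June 2013 has no leap day


def capacity_factor(energy_kwh: float, rated_kw: float, hours: float) -> float:
    """Fraction of the theoretical maximum energy that was produced."""
    return energy_kwh / (rated_kw * hours)


def implied_energy_kwh(factor: float, rated_kw: float, hours: float) -> float:
    """Energy implied by a given capacity factor (inverse of the above)."""
    return factor * rated_kw * hours


# Assumption: the rated power is the 3.2 kW AC peak quoted for the panels.
yearly_kwh = implied_energy_kwh(0.16, 3.2, HOURS_PER_YEAR)
print(f"implied yearly production: {yearly_kwh:.0f} kWh")
print(f"implied monthly average:   {yearly_kwh / 12:.0f} kWh")
```

Under that assumption, the implied monthly average falls between the summer figure (more than 500 kWh) and the winter figure (less than half of that) reported above.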

Interestingly, grid energy consumption in July 2012 was significantly lower than in other months because we were experimenting with GreenSwitch, transitioning machines to sleep, and using batteries (charged with solar energy) to reduce brown energy consumption. Starting in November 2012, we raised the internal setpoint temperature from 27°C to 30°C.

Figure 2. Energy consumption, net metering, and temperatures from April 2012 to January 2014. The figure shows the seasonal patterns for both renewable energy generation and temperature. (Axes: energy in MWh and temperature in °C, by month. Series: net meter, solar use, grid use, inside temperature, outside temperature.)

Cooling

Figure 3 shows the operation of the cooling system in Parasol during the second half of August 2012. In this time period, the setpoint for internal temperature was 30°C; the dashed line shows the actual internal temperature, whereas the solid line shows the outside temperature. The light gray area shows the operation of the free-cooling unit, whereas the dark gray area shows the operation of the air conditioner. Note that even though this time period is in the summer, the air conditioner only ran during two days, when the outside temperatures exceeded 30°C. Much of the time, the free-cooling unit ran below 25 percent fan speed.

The average PUE when including both conversion losses and cooling overheads for Parasol has been lower than 1.13, showing that free cooling is very effective at keeping cooling overheads low. The air conditioner has run for less than 20 days in a year, and less than 1 percent of the total time. Most of the time, our setpoint has been 30°C, and the typical temperatures inside Parasol (> 95 percent of the time) have ranged between 22°C and 30°C. We have also been experimenting with novel cooling policies and pushing the limits of Parasol. During these experiments, the internal temperature at the control sensor has ranged between 15°C and 36°C.
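PUE is the ratio of total facility energy to the energy delivered to the IT equipment, so conversion losses and cooling each add to the numerator. The following minimal sketch shows how the two PUE figures in this article relate; the energy values in the example are hypothetical round numbers chosen only to land in the reported ranges, not Parasol measurements.

```python
def pue(it_kwh: float, conversion_loss_kwh: float = 0.0,
        cooling_kwh: float = 0.0) -> float:
    """Power usage effectiveness: total facility energy over IT energy."""
    if it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return (it_kwh + conversion_loss_kwh + cooling_kwh) / it_kwh


# Hypothetical month: 1,000 kWh of IT energy, 70 kWh lost in conversions,
# and 50 kWh spent on free cooling and air conditioning.
losses_only = pue(1000.0, conversion_loss_kwh=70.0)
with_cooling = pue(1000.0, conversion_loss_kwh=70.0, cooling_kwh=50.0)
print(f"PUE with conversion losses only: {losses_only:.2f}")
print(f"PUE with cooling included:       {with_cooling:.2f}")
```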

Thus far, we have replaced five hard disk drives, two solid-state drives, and one motherboard. Although this data is not statistically significant, it is possible that our experiments have decreased the reliability of the IT equipment.

Off-grid operation: Hurricane Sandy

In late October 2012, the US East Coast was hit by Hurricane Sandy. The storm reached Rutgers University on 29 October, and the grid power and network suffered outages for more than 20 hours. Figure 4 shows the behavior of Parasol and the wind speed at our location from 28 October to 1 November. Rutgers lost power on a Monday afternoon, at the height of the measured wind speed (> 70 km/h), and it did not come back until the afternoon of the next day. During this time, Parasol used its batteries and solar energy to operate normally (although we did transition half of the machines to sleep because they were not being used). This experience demonstrates the potential for green datacenters to operate through power outages (or in remote locations without a reliable grid power source).

Figure 3. Cooling system operation from 15 August 2013 through 30 August 2013. The setpoint for internal temperature was 30°C; the air conditioner only ran during two days, when the outside temperature exceeded 30°C. (Axes: temperature in °C and speed in percent, by day. Series: inside temperature, outside temperature, air conditioner, free cooling.)

Figure 4. Parasol's operation during Hurricane Sandy. Parasol used its batteries and solar energy to operate normally during a power outage of more than 20 hours. (Axes: wind speed in m/s, power in kW, and battery charge level in percent, from Sunday 28 October 2012 to Wednesday 31 October 2012. Series: IT load, battery discharge, battery charge, grid use, solar use, battery charge level.)

GreenSwitch

We now discuss our research on managing Parasol. Specifically, we describe GreenSwitch, a system for scheduling workloads, selecting which source of energy to use (renewable, battery, and/or grid), and choosing the renewable energy storage medium (battery or grid) at each point in time. GreenSwitch seeks to minimize the overall cost of grid electricity (including both grid energy and peak grid power), while respecting the characteristics of the workload and battery lifetime constraints. It can also manage workloads and energy sources during grid outages.

Architecture

Figure 5 illustrates the GreenSwitch architecture. The predictor forecasts the workload and the renewable energy production one day into the future at the granularity of one hour. The solver takes these predictions and the current battery charge level as input, and outputs a workload schedule and an energy source and storage schedule. To compute these schedules, the solver uses analytical models of workload behavior, battery use, and grid electricity cost. The configurer effects the changes prescribed by the solver. The changes may involve transitioning some servers between power states and/or changing the configuration of the energy sources. (We have identified configuration parameters to the inverters and charge controllers that give us nearly full dynamic control of every source of energy available to Parasol.)
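To make the solver's inputs and outputs concrete, the sketch below produces an hourly energy source and storage schedule from load and solar forecasts and the current battery level. It is not the authors' solver: it replaces their analytical models with a simple greedy rule (solar first, then battery, then grid; surplus solar charges the battery and any remainder is net metered), and it ignores prices, losses, workload deferral, and battery lifetime constraints. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HourPlan:
    solar_used: float       # kWh of solar fed directly to the load
    battery_used: float     # kWh discharged from the battery
    grid_used: float        # kWh drawn from the grid
    battery_charged: float  # kWh of surplus solar stored in the battery
    net_metered: float      # kWh of surplus solar sent to the grid


def plan_sources(load, solar, battery_kwh, battery_cap_kwh):
    """Greedy stand-in for the solver: solar first, then battery, then grid.

    load and solar are per-hour forecasts in kWh (same length).
    Returns the hourly plans and the final battery level.
    """
    plans = []
    for demand, sun in zip(load, solar):
        solar_used = min(demand, sun)
        shortfall = demand - solar_used
        battery_used = min(shortfall, battery_kwh)
        battery_kwh -= battery_used
        grid_used = shortfall - battery_used
        surplus = sun - solar_used
        charged = min(surplus, battery_cap_kwh - battery_kwh)
        battery_kwh += charged
        plans.append(HourPlan(solar_used, battery_used, grid_used,
                              charged, surplus - charged))
    return plans, battery_kwh


# Three forecast hours: night, sunny midday, cloudy afternoon.
plans, final_level = plan_sources(load=[1.0, 1.0, 1.0],
                                  solar=[0.0, 4.0, 0.5],
                                  battery_kwh=0.5, battery_cap_kwh=2.0)
for hour, plan in enumerate(plans):
    print(hour, plan)
```

A cost-aware solver would instead weigh grid prices and peak power charges when deciding whether to discharge the battery or draw from the grid in a given hour.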

A full iteration of GreenSwitch occurs every 15 minutes, which enables it to properly control peak grid power use. (Utilities typically compute peak grid power use in windows of 15 minutes.) However, GreenSwitch checks the production of solar energy every 3 minutes. During each of these checks, GreenSwitch runs a full iteration if there has been an unexpected change in production.
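The timing just described can be sketched as a small decision function evaluated at every 3-minute check. This is an illustration under assumptions: the article does not say how GreenSwitch defines an "unexpected" change, so the `tolerance_kw` threshold and all function names here are hypothetical.

```python
CHECK_PERIOD_MIN = 3   # solar production is checked this often
FULL_PERIOD_MIN = 15   # a full iteration runs at least this often


def needs_full_iteration(minutes_since_full, predicted_kw, measured_kw,
                         tolerance_kw=0.3):
    """Decide at a 3-minute check whether to rerun predictor and solver.

    tolerance_kw is a hypothetical threshold for an "unexpected" change.
    """
    if minutes_since_full >= FULL_PERIOD_MIN:
        return True
    return abs(measured_kw - predicted_kw) > tolerance_kw


def simulate(measurements, predicted_kw):
    """Return the minutes at which a full iteration would run."""
    runs = []
    since_full = FULL_PERIOD_MIN  # forces a full iteration at minute 0
    for i, measured in enumerate(measurements):
        minute = i * CHECK_PERIOD_MIN
        if needs_full_iteration(since_full, predicted_kw, measured):
            runs.append(minute)
            since_full = 0
        since_full += CHECK_PERIOD_MIN
    return runs


# Solar output dips unexpectedly at minute 6 (1.2 kW instead of 2.0 kW).
runs = simulate([2.0, 2.0, 1.2, 2.0, 2.0, 2.0, 2.0, 2.0], predicted_kw=2.0)
print(runs)
```

In this simulation, full iterations run at minute 0, at minute 6 (triggered by the dip), and again 15 minutes later at minute 21.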

GreenSwitch evaluation on Parasol

We perform day-long experiments with Parasol and an implementation of GreenSwitch for the Hadoop MapReduce framework. We study two widely different Hadoop traces, called Facebook and Nutch. The former derives from a larger batch-job trace from Facebook,9 whereas the latter is the indexing part of a Web search system.10 We instantiate our models with the on-peak/off-peak grid energy prices and the peak grid power charges at our location. We assume the utility pays the wholesale price of electricity for net metering.
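The grid electricity cost that GreenSwitch seeks to minimize has three parts as described in this article: an energy charge that depends on the time-of-use price, a charge on the peak power drawn in any 15-minute window, and a credit for net-metered energy at the wholesale price. The sketch below shows one way such a bill could be computed; the prices and the function name are hypothetical, and real tariffs (including the one at the authors' location) may differ in structure.

```python
def grid_bill(grid_kw, price_per_kwh, peak_charge_per_kw,
              net_metered_kwh=0.0, wholesale_per_kwh=0.0):
    """Cost of grid electricity over a billing period.

    grid_kw: average grid draw in each 15-minute window (kW).
    price_per_kwh: energy price in force during each window ($/kWh).
    peak_charge_per_kw: charge on the highest 15-minute draw ($/kW).
    """
    hours_per_window = 0.25
    energy_cost = sum(kw * hours_per_window * price
                      for kw, price in zip(grid_kw, price_per_kwh))
    peak_cost = max(grid_kw, default=0.0) * peak_charge_per_kw
    credit = net_metered_kwh * wholesale_per_kwh
    return energy_cost + peak_cost - credit


# One hypothetical hour: two off-peak windows, then two on-peak windows.
bill = grid_bill(grid_kw=[1.0, 2.0, 0.0, 1.0],
                 price_per_kwh=[0.10, 0.10, 0.20, 0.20],
                 peak_charge_per_kw=5.0,
                 net_metered_kwh=2.0, wholesale_per_kwh=0.05)
print(f"bill: ${bill:.3f}")
```

Because the peak charge applies to the single highest window, a short burst of grid use can dominate the bill, which is why controlling peak grid power matters alongside total grid energy.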

Figure 5. GreenSwitch architecture. Rectangles with round edges are data structures. Rectangles with square borders are processes. (Components shown: energy availability prediction, workload prediction, predictor, solver, energy source schedule, workload schedule, configurer, battery charge level, Parasol.)

In the Facebook trace, jobs arrive throughout the day.9 Figure 6 shows the GreenSwitch behavior when the jobs in the trace are deferrable (each job can be delayed by up to 1 day), on 1 July 2012. The fill colors represent the use of the different energy sources, whereas the lines are the solar energy production (solid), the IT load (dots), the grid energy price (dashes, y-axis on the right), and the current peak grid power draw (dashes and dots). The white fill represents solar energy that was produced but lost because of inefficiency. The figure shows that GreenSwitch transitioned many servers to sleep in the early hours of the day and deferred some of the load until solar energy was available. When there was no solar energy, GreenSwitch drew energy from the batteries, since they stored enough capacity for the load that was not deferred. We also see that the solar energy was enough to power the workload, charge the batteries, and feed energy to the grid. Compared to a grid-only datacenter, GreenSwitch produced a profit of 9 percent in grid electricity cost. Given this profit, GreenSwitch would amortize the cost of the solar setup and batteries in only 7.6 years.

Despite seeking primarily to minimize grid electricity cost, GreenSwitch is also successful at reducing carbon footprints. It achieves reductions in grid energy use between 36 and 100 percent in our experiments with Facebook and Nutch, compared to a grid-only datacenter.

Main lessons learned

We have learned many important lessons in building Parasol and GreenSwitch. First, we learned that engineering contractors are unfamiliar with the state-of-the-art in datacenter design or with research prototypes. Our inability to bridge this knowledge gap quickly (or at all) caused delays. This is a challenge for organizations that want to build datacenters but lack the expertise.

Because Parasol was a major undertaking, its design needed to enable research on many topics (such as solar energy, free cooling, and wimpy servers). However, because we had not yet started to research every topic, we ended up designing more features and flexibility into Parasol than we might eventually need. This increased costs.

We also found that the need to collect fine-grained power measurements and accurately estimate energy losses led to extra design complexity. In addition, placing Parasol on the roof of a building (instead of on the ground) prevented shading from other buildings. Moreover, the cost of the roof placement was roughly the same as that of extending networking and power to ground locations far enough away from buildings.

Figure 6. GreenSwitch on deferrable Facebook workload. Most of the load during the night was delayed until renewable energy became available. Batteries were used when no renewable energy was available. (Axes: power in kW and price in $/kWh, by time of day. Series: battery discharge, battery charge, grid use, net metering, solar use, IT load, solar available, grid energy price, peak grid power.)

We learned that the wimpy fans in wimpy servers can generate nontrivial temperature differences across a free-cooled datacenter. Finally, and most importantly, we learned that building a real prototype is critical for completely understanding green datacenters. For example, in designing GreenSwitch, we detected instability in our charge controllers when switching power sources. As a result, GreenSwitch performs these switches in steps, with some idle time in between. Such effects would have been overlooked in simulation.
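The idea of switching in steps with idle time in between can be illustrated as follows. This is a hypothetical sketch, not GreenSwitch's configurer: the article does not say which controller parameters are stepped, how many steps are used, or how long the idle time is, so the "share of the load" abstraction, the step count, and the settle time are all assumptions.

```python
import time


def switch_in_steps(apply_share, start, target, steps=4, settle_s=0.0,
                    sleep=time.sleep):
    """Move a source's share of the load from start to target gradually.

    apply_share is a caller-supplied function that pushes one setpoint
    to the hardware; settle_s is the idle time between setpoints.
    """
    applied = []
    for i in range(1, steps + 1):
        share = start + (target - start) * i / steps
        apply_share(share)
        applied.append(share)
        if i < steps:
            sleep(settle_s)  # let the controllers settle before the next step
    return applied


# Record the setpoints instead of driving real hardware.
sent = []
switch_in_steps(sent.append, start=0.0, target=1.0, steps=4,
                settle_s=0.0, sleep=lambda seconds: None)
print(sent)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; on real hardware the default `time.sleep` and a nonzero `settle_s` would be used.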

Potential long-term impact

We expect Parasol and GreenSwitch to have a lasting impact on both academia and industry for several reasons.

Renewable energy

As we mentioned earlier, several companies are starting to invest in datacenter colocation and self-generation. Regardless of whether they're making these investments for market positioning, public relations, cost, or environmental reasons, the fact is that they are expecting bottom-line benefits from them. Moreover, despite their decreasing but still-high capital costs, exploiting renewables in datacenters could reduce overall energy costs, peak grid power costs, or both, as our ASPLOS paper explains. We expect that an increasing number of companies will see benefits in exploiting renewables.

Some research groups have also started studying colocated and self-generating datacenters.4,5,7,11,12 These studies have been attracting the attention of a growing community, with publications in venues such as the International Symposium on Computer Architecture (ISCA) and the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). We expect that our design and experience with Parasol will accelerate this growth, as researchers realize that they can build nontrivial prototypes at relatively low cost. Moreover, our analysis of solar and wind energy cost and space requirements suggests that green datacenters will become increasingly attractive.3

More broadly than datacenters, our experience will likely encourage more researchers to consider the implications of external signals (such as variable-electricity pricing and availability) on computing and communication in general.

Green datacenter prototype

There has been a dearth of real platforms for the study of colocated and self-generating green datacenters. Parasol addresses this need and is the first platform of its kind. Prior studies have had to resort to simulations or small implementations. In our ASPLOS paper,3 we list instances in which such alternatives would have hidden important effects. We mentioned instability issues earlier. Another example is that energy losses (for example, in power conversion) are highly dependent on load, rather than a fixed percentage, as often assumed in simulation. These instances will encourage researchers to build prototypes for their studies. We expect the Parasol design to serve as a model for these future research prototypes. Moreover, Parasol enables research on various important topics, including solar energy and its impact on computing, energy storage and its ability to lower costs, free cooling and its impact on reliability, wimpy servers and their performance/energy trade-offs, and the development of distributed storage systems using solid-state drives. These topics are of interest to both industry and academia.
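The difference between a fixed-percentage loss model and a load-dependent one can be seen in a small numerical sketch. The coefficients below are hypothetical and chosen only for illustration (a constant overhead plus a proportional term is one common simplification); they are not measured Parasol values.

```python
def fixed_loss_kw(load_kw, loss_fraction=0.07):
    """Simulation-style assumption: losses are a fixed share of the load."""
    return load_kw * loss_fraction


def load_dependent_loss_kw(load_kw, idle_kw=0.05, proportional=0.04):
    """Illustrative alternative: a constant overhead plus a share of the load."""
    return idle_kw + proportional * load_kw


for load in (0.5, 1.0, 2.0):
    fixed = fixed_loss_kw(load)
    dependent = load_dependent_loss_kw(load)
    print(f"load {load:.1f} kW: fixed model {fixed / load:.1%} of load, "
          f"load-dependent model {dependent / load:.1%} of load")
```

With these assumed coefficients, the fixed model understates relative losses at low load and slightly overstates them at high load, which is the kind of effect a fixed-percentage simulation would hide.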

In its current form, Parasol is a blueprint for industry to build small-scale, low-density green datacenters for enterprises and educational institutions. Self-generating containers are cheaper and more practical to operate than machine rooms inside existing buildings, and can be placed in less-valuable locations. Parasol is also suitable for remote deployments with poor or no access to electricity (networking might need to take place over satellite in this case).

Energy source and storage manager for green datacenters

GreenSwitch simultaneously manages workload demand, multiple energy sources (renewable, battery, and grid), and multiple energy stores (battery and grid). Our results show that it is consistently effective at reducing grid electricity costs and carbon footprints.

Although often overlooked in academia, simplicity and adaptability are key requirements for practical adoption by industry. We designed GreenSwitch to have both properties. Specifically, it uses simple models of solar energy availability, energy demand, and battery behavior. In addition, although our current implementation targets Hadoop, GreenSwitch is modular in that only one component (the configurer) is specific to the underlying computing framework.

Research avenues

Parasol and GreenSwitch create many new research avenues. For example, Parasol enables the study of the interplay between solar energy and free cooling; interestingly, solar energy is most abundant when the outside temperature is hottest (that is, when standard chiller-based cooling might be necessary in warm climates). As another example, GreenSwitch demonstrates the benefits of aggressive and coordinated management of energy sources and stores and workload execution, as well as the interplay between using batteries for powering the workload and for storing renewable energy. Prior work on aggressive use of batteries did not consider renewables.13

Datacenters that are partially powered by renewable energy represent an increasingly interesting research topic from many perspectives. In this article, we have described Parasol, a solar-powered datacenter that we have built as a research platform, and our experience in constructing and operating Parasol. We have also described GreenSwitch, a workload and power source management system. As we mentioned earlier, Parasol and GreenSwitch enable the exploration of many research avenues. We are currently studying the behavior and management of free-cooled datacenters, as well as the interaction between solar energy and free cooling. We are also studying the design of green energy-aware latency-sensitive applications, such as cloud-based distributed storage systems. Specifically, we are exploring how to design systems that can maintain service-level objectives (for example, a desired 99th percentile response time), while maximizing usage of renewable energy and minimizing usage of brown energy. In conclusion, we hope that our experience with Parasol and GreenSwitch will entice other researchers and practitioners to consider these datacenters.

Acknowledgments

We thank Abhishek Bhattacharjee, David Meisner, Santosh Nagarakatte, Anand Sivasubramaniam, and Thomas F. Wenisch for comments that helped us improve this article. We are also grateful to our sponsors, NSF grant CSR-1117368, and the Rutgers Green Computing Initiative. Finally, we are indebted to Joan Stanton, Heidi Szymanski, Jon Tenenbaum, Chuck Depasquale, SMA America, and Michael J. Pazzani for their extensive help in building and funding Parasol.

References

1. J. Koomey, Growth in Data Center Electricity Use 2005 to 2010, Analytic Press, 2011.

2. J. Mankoff, R. Kravets, and E. Blevis, "Some Computer Science Issues in Creating a Sustainable World," Computer, vol. 41, no. 8, 2008, pp. 102-105.

3. I. Goiri et al., "Parasol and GreenSwitch: Managing Datacenters Powered by Renewable Energy," Proc. 18th Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS 13), 2013, pp. 51-64.

4. B. Aksanli et al., "Utilizing Green Energy Prediction to Schedule Mixed Batch and Service Jobs in Data Centers," Proc. 4th Workshop Power-Aware Computing and Systems (HotPower 11), 2011, article no. 5.

5. I. Goiri et al., "GreenSlot: Scheduling Energy Consumption in Green Datacenters," Proc. Int'l Conf. High Performance Computing, Networking, Storage and Analysis (SC 11), 2011, article no. 20.

6. I. Goiri et al., "GreenHadoop: Leveraging Green Energy in Data-Processing Frameworks," Proc. 7th ACM European Conf. Computer Systems (EuroSys 12), 2012, pp. 57-70.

7. A. Krioukov et al., "Integrating Renewable Energy Using Data Analytics Systems: Challenges and Opportunities," Data Eng. Bulletin, vol. 34, no. 1, 2011, pp. 3-11.

8. Z. Liu et al., "Renewable and Cooling Aware Workload Management for Sustainable Data Centers," Proc. 12th ACM SIGMETRICS/PERFORMANCE Joint Int'l Conf. Measurement and Modeling of Computer Systems, 2012, pp. 175-186.

9. Y. Chen et al., "The Case for Evaluating MapReduce Performance Using Workload Suites," Proc. Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2011, pp. 390-399.

10. EPFL, "CloudSuite," 2012; http://parsa.epfl.ch/cloudsuite/cloudsuite.html.

11. C. Li, A. Qouneh, and T. Li, "iSwitch: Coordinating and Optimizing Renewable Energy Powered Server Clusters," Proc. 39th Ann. Int'l Symp. Computer Architecture (ISCA 12), 2012, pp. 512-523.

12. N. Sharma et al., "Blink: Managing Server Clusters on Intermittent Power," Proc. 16th Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS 11), 2011, pp. 185-198.

13. S. Govindan et al., "Leveraging Stored Energy for Handling Power Emergencies in Aggressively Provisioned Datacenters," Proc. 17th Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS 12), 2012, pp. 75-86.

Íñigo Goiri is a research associate in the Department of Computer Science at Rutgers University. His research interests include energy-efficient datacenter design and virtualization. Goiri has a PhD in computer science from the Universitat Politècnica de Catalunya.

William Katsak is a PhD student in the Department of Computer Science at Rutgers University. His research focuses on power management of datacenters. Katsak has an MS in computer science from Rutgers University. He is a student member of IEEE and the ACM.

Kien Le is a software engineer at A10 Networks. His research focuses on building a cost-aware load distribution framework to reduce energy consumption and promote renewable energy. Le has a PhD in computer science from Rutgers University, where he completed the work for this article.

Thu D. Nguyen is an associate professor in the Department of Computer Science at Rutgers University. His research interests include green computing, distributed and parallel systems, operating systems, and information retrieval. Nguyen has a PhD in computer science and engineering from the University of Washington. He is a member of IEEE and the ACM.

Ricardo Bianchini is a professor in the Department of Computer Science at Rutgers University. He is currently on leave from Rutgers and working as the chief efficiency strategist at Microsoft. His research interests include the power, energy, and thermal management of servers and datacenters. Bianchini has a PhD in computer science from the University of Rochester. He is an ACM distinguished scientist and a senior member of IEEE.

Direct questions and comments about this article to Íñigo Goiri, Department of Computer Science, Rutgers University, 110 Frelinghuysen Road, Piscataway, NJ 08854-8019; [email protected].


QUALITY-OF-SERVICE-AWARE SCHEDULING IN HETEROGENEOUS DATACENTERS WITH PARAGON

PARAGON, AN ONLINE, SCALABLE DATACENTER SCHEDULER, ENABLES BETTER CLUSTER UTILIZATION AND PER-APPLICATION QUALITY-OF-SERVICE GUARANTEES BY LEVERAGING DATA MINING TECHNIQUES THAT FIND SIMILARITIES BETWEEN KNOWN AND NEW APPLICATIONS. FOR A 2,500-WORKLOAD SCENARIO, PARAGON PRESERVES PERFORMANCE CONSTRAINTS FOR 91 PERCENT OF APPLICATIONS, WHILE SIGNIFICANTLY IMPROVING UTILIZATION. IN COMPARISON, A BASELINE LEAST-LOADED SCHEDULER ONLY PROVIDES SIMILAR GUARANTEES FOR 3 PERCENT OF WORKLOADS.

Efficiency is a first-class requirement and the main source of scalability concerns both for small and large systems.1,2 Achieving high efficiency is not only a matter of sensible design, but also a function of how the system is managed, which becomes essential as the hardware grows progressively heterogeneous and parallel and applications get dynamic and diverse. Architecture has traditionally been about efficient system design. As efficiency increases in importance, architecture should be about both design and management for systems of any scale.

In this article, we focus on improving efficiency while guaranteeing high performance in large-scale systems. Although an increasing amount of computing now happens in public and private clouds, such as Amazon Elastic Compute Cloud (EC2; see http://aws.amazon.com/ec2) or vSphere (www.vmware.com/products/vsphere), datacenters continue to operate at utilizations in the single digits.1,3 This lessens the two main advantages of cloud computing (flexibility and cost efficiency, both for cloud operators and end users), because not only are the machines underutilized, they are also operating in a non-energy-proportional region.1,4

There can be several reasons why machines are underutilized. Two of the most prominent obstacles are interference between coscheduled applications and heterogeneity in server platforms. For more information, see the "Interference and Heterogeneity" sidebar.

In our paper presented at the 18th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2013),5 we introduced Paragon, an online and scalable

Christina Delimitrou

Christos Kozyrakis

Stanford University

0272-1732/14/$31.00 © 2014 IEEE. Published by the IEEE Computer Society.

Interference and Heterogeneity

Interference occurs as coscheduled applications contend for shared resources. Coscheduled applications may interfere negatively even if they run on different processor cores because they share caches, memory channels, storage, and networking devices.1,2 If unmanaged, interference can result in performance degradations of integer factors,2 especially when the application must meet tail latency guarantees apart from average performance.3 Figure A shows that an interference-oblivious scheduler will slow workloads down by 34 percent on average, with some running more than two times slower. This is undesirable for both users and operators.

Heterogeneity is the natural result of the infrastructure's evolution, as servers are gradually provisioned and replaced over the typical 15-year lifetime of a datacenter.4-7 At any point in time, a datacenter may host three to five server generations with a few hardware configurations per generation, in terms of the processor speed, memory, storage, and networking subsystems. Managing the different hardware incorrectly not only causes significant performance degradations to applications sensitive to server configuration, but also wastes resources as workloads occupy servers for significantly longer, and gives a low-quality signal to hardware vendors for the design of future platforms. Figure A shows that a heterogeneity-oblivious scheduler will slow applications down by 22 percent on average, with some running nearly 2 times slower (see the "Methodology" section in the main article).

Finally, a baseline scheduler that is oblivious to both interference and heterogeneity and that schedules applications to least-loaded servers is even worse (48 percent average slowdown), causing some workloads to crash due to resource exhaustion on the server. Unless interference and heterogeneity are managed in a coordinated fashion, the system loses both its efficiency and predictability guarantees. Previous research has identified the issues of heterogeneity6 and interference,2 but while most cloud management systems, such as Mesos8 or vSphere (www.vmware.com/products/vsphere), have some notion of contention or interference awareness, they either use empirical rules for interference management or assume long-running workloads (for example, online services), whose repeated behavior can be progressively modeled. In this article, we target both heterogeneity and interference and assume no a priori analysis of the application. Instead, we leverage information the system already has about the large number of applications it has previously seen.

References

1. S. Govindan et al., "Cuanta: Quantifying Effects of Shared On-Chip Resource Interference for Consolidated Virtual Machines," Proc. 2nd ACM Symp. Cloud Computing, 2011, article no. 22.

2. J. Mars et al., "Bubble-Up: Increasing Utilization in Modern Warehouse Scale Computers via Sensible Co-locations," Proc. 44th Ann. IEEE/ACM Int'l Symp. Microarchitecture, 2011, pp. 248-259.

3. D. Meisner et al., "Power Management of Online Data-Intensive Services," Proc. 38th Ann. Int'l Symp. Computer Architecture (ISCA 11), 2011, pp. 319-330.

4. L.A. Barroso and U. Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Morgan and Claypool Publishers, 2009.

5. C. Kozyrakis et al., "Server Engineering Insights for Large-Scale Online Services," IEEE Micro, vol. 30, no. 4, 2010, pp. 8-19.

6. J. Mars, L. Tang, and R. Hundt, "Heterogeneity in 'Homogeneous' Warehouse-Scale Computers: A Performance Opportunity," IEEE Computer Architecture Letters, vol. 10, no. 2, 2011, pp. 29-32.

7. R. Nathuji, C. Isci, and E. Gorbatov, "Exploiting Platform Heterogeneity for Power Efficient Data Centers," Proc. 4th Int'l Conf. Autonomic Computing (ICAC 07), 2007, doi:10.1109/ICAC.2007.16.

8. B. Hindman et al., "Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center," Proc. 8th USENIX Conf. Networked Systems Design and Implementation,


[Figure A plot. Y-axis: speedup over alone on best platform (0.0 to 1.0). X-axis: workloads (0 to 5,000). Series: Alone on best platform, No interference, Least loaded, No heterogeneity.]

Figure A. Performance degradation for 5,000 applications on 1,000 Amazon Elastic Compute Cloud (EC2) servers with heterogeneity-oblivious, interference-oblivious, and baseline least-loaded schedulers compared to ideal scheduling (application runs alone on best platform). Results are ordered from worst- to best-performing workload.


datacenter scheduler that accounts for heterogeneity and interference. The key feature of Paragon is its ability to quickly and accurately classify an unknown application with respect to heterogeneity (which server configurations it will perform best on) and interference (how much interference it will cause to coscheduled applications and how much interference it can tolerate itself in multiple shared resources). Unlike previous techniques that require detailed profiling of each incoming application, Paragon's classification engine exploits existing data from previously scheduled workloads and requires only a minimal signal about a new workload. Specifically, it is organized as a low-overhead recommendation system similar to the one deployed for the Netflix Challenge,6 but instead of discovering similarities in users' movie preferences, it finds similarities in applications' preferences with respect to heterogeneity and interference. It uses singular value decomposition (SVD) to perform collaborative filtering and identify similarities between incoming and previously scheduled workloads.

Once an incoming application is classified, a greedy scheduler assigns it to the server that is the best possible match in terms of platform and minimum negative interference between all coscheduled workloads. Even though the final step is greedy, the high accuracy of classification leads to schedules that achieve both fast execution time and efficient resource usage. Paragon scales to systems with tens of thousands of servers and tens of configurations, running large numbers of previously unknown workloads. We implemented Paragon and showed that it significantly improves cluster utilization, while preserving per-application quality-of-service (QoS) guarantees both for small- and large-scale systems. For more information on related work, see the "Research Related to Paragon" sidebar.
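To make the greedy step concrete, here is a minimal sketch under simplifying assumptions that are ours, not the article's: interference is collapsed into a single scalar per application (Paragon considers multiple shared resources), pressure from neighbors is assumed to add up, and the names `App`, `Server`, `fits`, and `greedy_place` are hypothetical. It only shows the idea of picking the best platform among servers where no coscheduled workload is pushed past what it tolerates.

```python
from dataclasses import dataclass, field


@dataclass
class App:
    name: str
    platform_score: dict   # server configuration label -> estimated score (higher is better)
    tolerates: float       # interference the app can absorb (assumed scalar in [0, 1])
    causes: float          # interference the app generates (assumed scalar in [0, 1])


@dataclass
class Server:
    name: str
    config: str                                   # server configuration (SC) label
    residents: list = field(default_factory=list)


def pressure_on(target, group):
    """Total interference generated by everyone in the group except the target."""
    return sum(other.causes for other in group if other is not target)


def fits(app, server):
    """True if neither the new app nor any resident exceeds its tolerance."""
    group = server.residents + [app]
    return all(pressure_on(member, group) <= member.tolerates for member in group)


def greedy_place(app, servers):
    """Place the app on the feasible server with the best platform score."""
    candidates = [s for s in servers if fits(app, s)]
    if not candidates:
        return None  # no server satisfies the interference constraints
    best = max(candidates, key=lambda s: app.platform_score.get(s.config, 0.0))
    best.residents.append(app)
    return best
```

Because the choice is made one application at a time and never revisited, the quality of the result depends on how accurate the per-application estimates are, which is the point the paragraph above makes about classification accuracy.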

Fast and accurate classification

The key requirement for heterogeneity- and interference-aware scheduling is to quickly and accurately classify incoming applications. First, we need to know how fast an application will run on each of the tens of server configurations (SCs) available. Second, we need to know how much interference it can tolerate from other workloads in each of several shared resources without significant performance loss and how much interference it will generate itself. Our goal is to perform online scheduling for large-scale systems without any a priori knowledge about incoming applications. Most previous schemes address this issue with detailed but offline application characterization or long-term monitoring and modeling.7-9 Paragon takes a different approach. Its core idea is that, instead of learning each new workload in detail, the system leverages information it already has about applications it has seen to express the new workload as a combination of known applications. For this purpose, we use collaborative filtering techniques that combine a minimal profiling signal about the new application with the large amount of data available from previously scheduled workloads. The result is fast and accurate classification of incoming applications with respect to heterogeneity and interference. Within a minute of its arrival, an incoming workload is scheduled on a large-scale cluster.
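The idea of describing a new workload in terms of previously seen ones can be illustrated with a generic "fold-in" projection. This is a swapped-in textbook technique for illustration, not the SVD and PQ-reconstruction pipeline the article relies on: the few measured scores of the new application are fit by least squares onto the dominant patterns of the historical data, and the fit is used to estimate the unmeasured scores. The function name `estimate_missing`, the matrix layout (applications by server configurations), and the parameter `k` are assumptions made for this sketch.

```python
import numpy as np


def estimate_missing(history, new_row, k):
    """Estimate the NaN entries of new_row from a dense history matrix.

    history : (applications x server configurations) scores of known applications
    new_row : scores of the new application, NaN where it was not profiled
    k       : number of dominant patterns (top right singular vectors) to keep
    """
    _, _, vt = np.linalg.svd(history, full_matrices=False)
    basis = vt[:k].T                     # shape (configurations, k)
    seen = ~np.isnan(new_row)            # entries that were actually profiled
    # Least-squares weights that best explain the few profiled entries.
    weights, *_ = np.linalg.lstsq(basis[seen], new_row[seen], rcond=None)
    estimate = basis @ weights
    estimate[seen] = new_row[seen]       # keep measured values as measured
    return estimate


# Hypothetical data: three known applications scored on four configurations.
history = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [3.0, 6.0, 9.0, 12.0],
])
new_app = np.array([2.0, np.nan, np.nan, 8.0])   # profiled on two configurations only
print(estimate_missing(history, new_app, k=1))
```

The estimate is only as good as the assumption that the new application resembles a blend of earlier ones; with very few profiled entries, `k` should stay small so the least-squares fit is not underdetermined.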

Background on collaborative filtering

Collaborative filtering techniques are frequently used in recommendation systems. We use one of their most publicized applications, the Netflix Challenge,6 to provide a quick overview of the two analytical methods we rely on, SVD and PQ reconstruction.10 In this case, the goal is to provide valid movie recommendations for Netflix users given the ratings they have provided for various other movies.

The input to the analytical framework is a sparse matrix A, the utility matrix, with one row per user and one column per movie. The elements of A are the ratings that users have assigned to movies. Each user has rated only a small subset of movies; this is especially true for new users, who might only have a handful of ratings, or even none. Although techniques exist that address the cold-start problem (that is, providing recommendations to a completely fresh user with no ratings), we focus here on users for whom the system has some minimal input. If we can estimate the values of the missing ratings in the sparse matrix A,


we can make movie recommendations; that is, we can suggest that users watch the movies for which the recommendation system estimates, with high confidence, that they will give high ratings.
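As a small illustration of the utility matrix described above, the sketch below stores ratings in a NumPy array with NaN marking "not rated", lists the cells that would need to be estimated, and shows how estimates (however they are obtained) turn into recommendations. The ratings, the `estimates` argument, and the helper name `recommend` are hypothetical; the estimation step itself is not shown here.

```python
import numpy as np

# Hypothetical utility matrix: one row per user, one column per movie.
ratings = np.array([
    [5.0, np.nan, 4.0, np.nan],
    [np.nan, 3.0, np.nan, 1.0],
    [4.0, np.nan, np.nan, np.nan],   # a newer user with a single rating
])

known = ~np.isnan(ratings)                        # True where a rating exists
density = known.mean()                            # fraction of filled-in cells
missing_cells = list(zip(*np.nonzero(~known)))    # (user, movie) pairs to estimate


def recommend(estimates, known_mask, user, top_n=1):
    """Return the user's unrated movies with the highest estimated ratings."""
    # Already-rated movies are excluded by giving them the lowest possible score.
    scores = np.where(known_mask[user], -np.inf, estimates[user])
    order = np.argsort(scores)[::-1]
    return [int(movie) for movie in order[:top_n] if np.isfinite(scores[movie])]
```

Real utility matrices are far sparser than this toy one, which is why a dedicated sparse representation would normally replace the dense NaN-filled array.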

The first step is to apply SVD, a matrix factorization method used for dimensionality reduction and similarity identification. Factoring A decomposes it into the following matrices of left (U) and right (V)


Research Related to Paragon

We discuss work relevant to Paragon in the areas of datacenter scheduling, virtual machine (VM) management, workload rightsizing, and scheduling for heterogeneous multicore chips.

Datacenter scheduling

Recent work on datacenter scheduling has highlighted the importance of platform heterogeneity and workload interference. Mars et al. showed that the performance of Google workloads can vary by up to 40 percent because of heterogeneity, even when considering only two server configurations, and by up to two times because of interference, even when considering only two colocated applications.1,2 Govindan et al. also present a scheme to quantify the effects of cache interference between consolidated workloads.3 In Paragon, we extend the concepts of heterogeneity- and interference-aware scheduling by providing an online, scalable, and low-overhead methodology that accurately classifies applications for both heterogeneity and interference across multiple resources.

VM management

Systems such as vSphere (http://www.vmware.com/products/vsphere) or the VM platforms on public cloud providers can schedule diverse workloads submitted by users on the available servers. In general, these platforms account for application resource requirements that they either expect the user to express or learn over time by monitoring workload execution. Paragon can complement such systems by making scheduling decisions on the basis of heterogeneity and interference and detecting when an application should be considered for rescheduling.

Resource management and rightsizing

There has been significant work on resource allocation in virtualized and nonvirtualized large-scale datacenters. Mesos performs resource allocation between distributed computing frameworks such as Hadoop or Spark.4 Rightscale (http://www.rightscale.com) automatically scales out three-tier applications to react to changes in the load in Amazon's cloud service. DejaVu serves a similar goal by identifying a few workload classes and, based on them, reusing previous resource allocations to minimize reallocation overheads.5 In general, Paragon is complementary to rightsizing systems. Once such a system determines the amount of resources needed by an application, Paragon can classify and schedule it on the proper hardware platform in a way that minimizes interference.

Scheduling for heterogeneous multicore chips

Scheduling in heterogeneous chip multiprocessors (CMPs) shares some concepts and challenges with scheduling in heterogeneous datacenters; thus, some of the ideas in Paragon can be applied in heterogeneous CMP scheduling as well. Shelepov et al. present a scheduler for heterogeneous CMPs that is simple and scalable,6 whereas Craeynest et al. use performance statistics to estimate which workload-to-core mapping is likely to provide the best performance.7 Given the increasing number of cores per chip and coscheduled tasks, techniques similar to the ones used in Paragon may be applicable when deciding how to schedule applications in heterogeneous CMPs.

References

1. J. Mars, L. Tang, and R. Hundt, "Heterogeneity in 'Homogeneous' Warehouse-Scale Computers: A Performance Opportunity," IEEE Computer Architecture Letters, vol. 10, no. 2, 2011, pp. 29-32.

2. J. Mars et al., "Bubble-Up: Increasing Utilization in Modern Warehouse Scale Computers via Sensible Co-locations," Proc. 44th Ann. IEEE/ACM Int'l Symp. Microarchitecture, 2011, pp. 248-259.

3. S. Govindan et al., "Cuanta: Quantifying Effects of Shared On-Chip Resource Interference for Consolidated Virtual Machines," Proc. 2nd ACM Symp. Cloud Computing, 2011, article no. 22.

4. B. Hindman et al., "Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center," Proc. 8th USENIX Conf. Networked Systems Design and Implementation,

5. N. Vasic et al., "DejaVu: Accelerating Resource Allocation in Virtualized Environments," Proc. 17th Int'l Conf. Architectural Support for Programming Languages and Operating Systems, 2012, pp. 423-436.

6. D. Shelepov et al., "HASS: A Scheduler for Heterogeneous Multicore Systems," ACM SIGOPS Operating Systems Rev., vol. 43, no. 2, 2009, pp. 66-75.

7. K. Craeynest et al., "Scheduling Heterogeneous Multi-Cores through Performance Impact Estimation (PIE)," Proc. 39th Ann. Int'l Symp. Computer Architecture (ISCA 12), 2012, pp. 213-224.


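singular vectors and the diagonal matrix of singular values (Σ, with diagonal entries σ_1 through σ_r):

\[
A_{m,n} =
\begin{pmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
\vdots  & \vdots  & \ddots & \vdots  \\
a_{m,1} & a_{m,2} & \cdots & a_{m,n}
\end{pmatrix}
= U \cdot \Sigma \cdot V^{T}
\]

where

\[
U_{m \times r} =
\begin{pmatrix}
u_{1,1} & \cdots & u_{1,r} \\
\vdots  & \ddots & \vdots  \\
u_{m,1} & \cdots & u_{m,r}
\end{pmatrix},
\quad
V_{n \times r} =
\begin{pmatrix}
v_{1,1} & \cdots & v_{1,r} \\
\vdots  & \ddots & \vdots  \\
v_{n,1} & \cdots & v_{n,r}
\end{pmatrix},
\quad
\Sigma_{r \times r} =
\begin{pmatrix}
\sigma_{1} & \cdots & 0 \\
\vdots     & \ddots & \vdots \\
0          & \cdots & \sigma_{r}
\end{pmatrix}
\]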
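The factorization of A into left singular vectors, singular values, and right singular vectors can be checked numerically. The sketch below uses NumPy's `np.linalg.svd` on a small, fully populated matrix with made-up ratings; a real utility matrix has missing entries that must be handled before a plain SVD applies, which this example sidesteps on purpose.

```python
import numpy as np

# Hypothetical dense ratings: 5 users (rows) by 4 movies (columns).
A = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 1.0, 2.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
    [5.0, 5.0, 2.0, 1.0],
])

# Thin SVD: NumPy returns the singular values as a 1-D array in decreasing
# order, and the right singular vectors already transposed (V^T).
U, sigma, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the r = rank(A) nonzero similarity concepts.
r = int(np.linalg.matrix_rank(A))
U, sigma, Vt = U[:, :r], sigma[:r], Vt[:r, :]

# U is m x r, diag(sigma) is r x r, and Vt is r x n, so the product is m x n.
reconstructed = U @ np.diag(sigma) @ Vt
print(np.allclose(reconstructed, A))

# Dropping the weakest concepts gives a lower-rank approximation of A.
k = 2
approx = U[:, :k] @ np.diag(sigma[:k]) @ Vt[:k, :]
```

Keeping only the largest singular values, as in `approx`, is the dimensionality-reduction use of SVD mentioned earlier: the strongest similarity concepts are retained and the weakest are discarded.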

Dimension r is the rank of matrix A, and it represents the number of similarity concepts identified by SVD. For instance, one similarity concept might be that certain movies belong to the drama category, while another might be that most users who liked the movie The Lord of the Rings: The Fellowship of the Ring also liked The Lord of the Rings: The Two Towers. Similarity concepts are represented by singular values σ_i in matrix Σ (the diagonal matrix of singular values) and the confidence in a similarity concept by the magnitude of the corresponding singular value. Singular values in Σ are ordered by decreasing magnitude. Matrix U captures the strength of the correlation between a row of A and a similarity concept. In other words, it expresses how users relate to similarity concepts such as the one about liking drama movies. Matrix V captures the strength of the correlation of a column of A to a similarity concept. In other words, to what extent does a mo
