micro computer architecture

The magazine for chip and silicon systems designers
The Academic and Business Marriage
http://www.computer.org/micro
May/June 2014

  • IEEE Micro (ISSN 0272-1732) is published bimonthly by the IEEE Computer Society. IEEE Headquarters, Three Park Ave., 17th Floor, New York, NY 10016-5997; IEEE Computer Society Headquarters, 2001 L St., Ste. 700, Washington, DC 20036; IEEE Computer Society Publications Office, 10662 Los Vaqueros Circle, PO Box 3014, Los Alamitos, CA 90720. Annual subscription rates: IEEE Computer Society members get the lowest rates, US$45 (print and electronic). Go to http://www.computer.org/subscribe to order and for more information on other subscription prices. Back issues: members, $20; nonmembers, $148. This magazine is also available on the Web.

    Postmaster: Send address changes and undelivered copies to IEEE, Membership Processing Dept., 445 Hoes Ln., Piscataway, NJ 08855. Periodicals postage is paid at New York, NY, and at additional mailing offices. Canadian GST #125634188. Canada Post Corp. (Canadian distribution) Publications Mail Agreement #40013885. Return undeliverable Canadian addresses to 4960-2 Walker Road; Windsor, ON N9A 6J3. Printed in USA.

    Reuse rights and reprint permissions: Educational or personal use of this material is permitted without fee, provided such use: 1) is not made for profit; 2) includes this notice and a full citation to the original work on the first page of the copy; and 3) does not imply IEEE endorsement of any third-party products or services. Authors and their companies are permitted to post the accepted version of IEEE-copyrighted material on their own web servers without permission, provided that the IEEE copyright notice and a full citation to the original work appear on the first screen of the posted copy. An accepted manuscript is a version which has been revised by the author to incorporate review suggestions, but not the published version with copy-editing, proofreading, and formatting added by IEEE. For more information, please go to http://www.ieee.org/publications_standards/publications/rights/paperversionpolicy.html.

    Permission to reprint/republish this material for commercial, advertising, or promotional purposes or for creating new collective works for resale or redistribution must be obtained from IEEE by writing to the IEEE Intellectual Property Rights Office, 445 Hoes Lane, Piscataway, NJ 08854-4141 or [email protected]. © 2014 IEEE. All rights reserved.

    Abstracting and library use: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy for private use of patrons, provided the per-copy fee indicated in the code at the bottom of the first page is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.

    Editorial: Unless otherwise stated, bylined articles, as well as product and service descriptions, reflect the author's or firm's opinion. Inclusion in IEEE Micro does not necessarily constitute an endorsement by IEEE or the Computer Society. All submissions are subject to editing for style, clarity, and space. IEEE prohibits discrimination, harassment, and bullying. For more information, visit http://www.ieee.org/web/aboutus/whatis/policies/p9-26.html.

    May/June 2014 Volume 34 Number 3

    Features

    Guest Editors' Introduction: Top Picks from the 2013 Computer Architecture Conferences
    Mithuna S. Thottethodi and Shubu Mukherjee

    Designing and Managing Datacenters Powered by Renewable Energy
    Íñigo Goiri, William Katsak, Kien Le, Thu D. Nguyen, and Ricardo Bianchini

    Quality-of-Service-Aware Scheduling in Heterogeneous Datacenters with Paragon
    Christina Delimitrou and Christos Kozyrakis

    A Case for Specialized Processors for Scale-Out Workloads
    Michael Ferdman, Almutaz Adileh, Onur Kocberber, Stavros Volos, Mohammad Alisafaee, Djordje Jevdjic, Cansu Kaynak, Adrian Daniel Popescu, Anastasia Ailamaki, and Babak Falsafi

    Smart: Single-Cycle Multihop Traversals over a Shared Network on Chip
    Tushar Krishna, Chia-Hsin Owen Chen, Woo-Cheol Kwon, and Li-Shiuan Peh

    Networks on Chip with Provable Security Properties
    Hassan M.G. Wassel, Ying Gao, Jason K. Oberg, Ted Huffmire, Ryan Kastner, Frederic T. Chong, and Timothy Sherwood

    Cache Coherence for GPU Architectures
    Inderpreet Singh, Arrvindh Shriraman, Wilson W.L. Fung, Mike O'Connor, and Tor M. Aamodt

    A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches
    Jaewoong Sim, Gabriel H. Loh, Vilas Sridharan, and Mike O'Connor

    Decoupled Compressed Cache: Exploiting Spatial Locality for Energy Optimization
    Somayeh Sardashti and David A. Wood

    Sonic Millip3De: An Architecture for Handheld 3D Ultrasound
    Richard Sampson, Ming Yang, Siyuan Wei, Chaitali Chakrabarti, and Thomas F. Wenisch

    Hardware Partitioning for Big Data Analytics
    Lisa Wu, Raymond J. Barker, Martha A. Kim, and Kenneth A. Ross

    Efficient Spatial Processing Element Control via Triggered Instructions
    Angshuman Parashar, Michael Pellauer, Michael Adler, Bushra Ahsan, Neal Crago, Daniel Lustig, Vladimir Pavlov, Antonia Zhai, Mohit Gambhir, Aamer Jaleel, Randy Allmon, Rachid Rayess, Stephen Maresh, and Joel Emer

    DeNovoND: Efficient Hardware for Disciplined Nondeterminism
    Hyojin Sung, Rakesh Komuravelli, and Sarita V. Adve

    Departments

    From the Editor in Chief: Top Picks from 2013

    Awards: Reflections from the 2013 Eckert-Mauchly Award Recipient

    Micro Economics: The Academic and Business Marriage

    Cover artwork by Giacomo Marchesi, www.GiacomoMarchesi.com

  • EDITOR IN CHIEF

    Erik R. Altman, Thomas J. Watson Research Center, [email protected]

    ASSOCIATE EDITOR IN CHIEF

    Lieven Eeckhout, Ghent University, [email protected]

    ADVISORY BOARD

    David H. Albonesi, Pradip Bose, Kemal Ebcioglu, Michael Flynn, Ruby B. Lee, Yale Patt, James E. Smith, and Marc Tremblay

    EDITORIAL BOARD

    Alper Buyuktosunoglu, IBM
    Pradeep Dubey, Intel Corp.
    Sandhya Dwarkadas, University of Rochester
    Babak Falsafi, Ecole Polytechnique Federale de Lausanne
    Krisztian Flautner, ARM
    R. Govindarajan, Indian Institute of Science
    Shane Greenstein, Northwestern University
    Lizy Kurian John, University of Texas at Austin
    Stephen W. Keckler, University of Texas at Austin
    Margaret Martonosi, Princeton University
    Richard Mateosian
    Shubu Mukherjee, Cavium Networks
    Toshio Nakatani, IBM
    Vojin G. Oklobdzija, New Mexico State University
    Ronny Ronen, Intel Corp.
    Kevin W. Rudd, US Naval Academy
    Andre Seznec, INRIA Rennes
    Richard H. Stern
    Olivier Temam, INRIA
    Mateo Valero, Technical University of Catalonia
    Tilman Wolf, University of Massachusetts, Amherst
    Xiaodong Zhang, Ohio State University

    EDITORIAL STAFF

    Editorial Management

    Molly Gamborg

    Contributing Editors

    Amber Ankerholz, Thomas Centrella, Kristine Kelly, Keri Schreiner, Dale Strok, and Joan Taylor

    Director, Products & Services

    Evan Butterfield

    Senior Manager, Editorial Services

    Robin Baldwin

    Associate Manager, Peer Review & Periodical Administration

    Hilda Carman

    Senior Business Development Manager

    Sandra Brown

    Senior Advertising Coordinator

    Marian Anderson

    EDITORIAL OFFICE

    PO Box 3014, Los Alamitos, CA 90720; (714) 821-8380; [email protected]

    Submissions:

    https://mc.manuscriptcentral.com/micro-cs

    Author guidelines:

    http://www.computer.org/micro

    IEEE COMPUTER SOCIETY PUBLICATIONS BOARD

    Vice President

    Jean-Luc Gaudiot

    Magazine Operations Chair

    Paolo Montuschi

    Transactions Operations Committee

    Laxmi N. Bhuyan

    Digital Library Operations Committee

    Frank Ferrante

    Plagiarism Chair

    David S. Ebert

    Executive Director

    Angela R. Burgess

    Members-at-Large

    Alain April, Greg Byrd, Robert Dupuis, Linda I. Shafer, H.J. Siegel, and Per Stenstrom

    COMPUTER SOCIETY MAGAZINE OPERATIONS COMMITTEE

    Paolo Montuschi (Chair)

    Erik R. Altman, Maria Ebling, Miguel Encarnacao, Lars Heide, Cecilia Metra, San Murugesan, Shari Lawrence Pfleeger, Michael Rabinovich, Yong Rui, Forrest Shull, George K. Thiruvathukal, Ron Vetter, and Daniel Zeng

  • From the Editor in Chief: Top Picks from 2013

    Erik R. Altman, Thomas J. Watson Research Center

    This double issue features our annual Top Picks from the microarchitecture conferences held in 2013. I thank Guest Editors Mithuna S. Thottethodi and Shubu Mukherjee for their outstanding job in all aspects of running the Program Committee and arriving at these selections. I am also happy to report that we received a record 101 submissions, from which 12 were selected for publication here.

    Like last year, it seemed an interesting exercise to compare the topics of 2013 Top Picks articles with topics covered in the inaugural 2003 Top Picks issue. In 2003, Guest Editors Charles Moore, Kevin W. Rudd, Ruby B. Lee, and Pradip Bose divided articles into six categories. I have assigned this year's articles to those same six categories, as shown in Table 1. In doing so, only one article did not seem a good fit to any of the 2003 categories. That article focuses on datacenters, and in 2003 there was no datacenter or cloud computing category. (Other articles this year also touch on datacenters, but have aspects that fit within 2003 categories.)

    The inability to continue Dennard scaling has yielded a major increase in articles in the "Unconventional architectures" category, whereas "Building on conventional microarchitectures" dropped to zero articles, as did "Performance analysis," with other categories staying roughly similar.

    How the Top Picks articles published here differ from the original conference publications is sometimes a point of confusion. Like all IEEE publications, IEEE Micro requires at least 30 percent new content over any previous publication. Top Picks articles generally meet this requirement via a three-page summary (in the initial submission), summarizing the paper and arguing for the potential of the work to have long-term impact. (Indeed, for the upcoming Top Picks to be published in 2015, Program Committee Chairs and Guest Editors Luis Ceze and Karin Strauss ask what the citation of your paper would be if it won the "test of time" award in 10 years.) In addition, IEEE Micro has a 5,000-word limit, so authors often have to condense their original paper. As a result, the IEEE Micro version of Top Picks papers generally provides more context and a slightly higher-level overview of the work, with the original conference paper serving as a deeper reference for readers interested in more detail. This approach inverts the historical practice of journals providing a more detailed record of conference papers, but we think that this Top Picks approach has served IEEE Micro well.

    This Top Picks issue is also unique among IEEE Micro editions (and possibly among all IEEE Computer Society publications) in that the Manuscript Central/ScholarOne reviewing system is not used for initial submissions. Instead, the Program Chairs deploy their preferred reviewing system. Papers recommended by the Program Committee for acceptance are then entered into Manuscript Central for the final stages of processing. This separate reviewing system makes it easier to manage the large volume of submissions.

    Why go into this detail about reviewing software? The IEEE Computer Society constantly works with Thomson Reuters, the owner of ScholarOne, to improve its capabilities. As part of that effort, ScholarOne maintains two websites to suggest ideas for its reviewing system and to vote on suggestions of others:

    Offer Suggestions: http://scholaroneideas.force.com/ideaListCustom
    Rate Ideas of Others: http://mchelp.manuscriptcentral.com/ScholarOneIdeas/howto.html

    I encourage any of you who author articles for IEEE Micro, or who serve as reviewers, to visit these sites and help improve ScholarOne.

    Finally, this issue continues our recent practice, led by Associate Editor in Chief Lieven Eeckhout, of noting major awards. More specifically, this issue includes a column by James Goodman about the work that led to his Eckert-Mauchly Award. Jim has many interesting and broad-ranging observations about his life and career, and I hope you will enjoy it as much as I did.

    With that, as with the Top Picks articles, happy reading!

    Erik R. Altman
    Editor in Chief
    IEEE Micro

    Erik R. Altman is the manager of the Dynamic Optimization Group at the Thomas J. Watson Research Center. Contact him at [email protected].

    Table 1. Mapping 2013 Top Picks articles to 2003 Top Picks categories.

    Category | No. of 2003 articles in category | No. of 2013 articles in category | Articles in this issue
    Unconventional architectures | 3 | 7 | A Case for Specialized Processors for Scale-Out Workloads; Smart: Single-Cycle Multihop Traversals over a Shared Network on Chip; Efficient Spatial Processing Element Control via Triggered Instructions; DeNovoND: Efficient Hardware for Disciplined Nondeterminism; Networks on Chip with Provable Security Properties; Sonic Millip3De: An Architecture for Handheld 3D Ultrasound; Hardware Partitioning for Big Data Analytics
    Power- and temperature-aware design | 2 | 2 | Designing and Managing Datacenters Powered by Renewable Energy; Decoupled Compressed Cache: Exploiting Spatial Locality for Energy Optimization*
    Reliability | 2 | 1 | A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches*
    Cache, memory, and multiprocessor optimizations | 4 | 3 | Cache Coherence for GPU Architectures; Decoupled Compressed Cache: Exploiting Spatial Locality for Energy Optimization*; A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches*
    Building on conventional microarchitectures | 2 | 0 | N/A
    Performance analysis | 2 | 0 | N/A
    None of the above | 0 | 1 | Quality-of-Service-Aware Scheduling in Heterogeneous Datacenters with Paragon

    *These articles fit in two categories from 2003.

  • Guest Editors' Introduction

    TOP PICKS FROM THE 2013 COMPUTER ARCHITECTURE CONFERENCES

    It gives us great pleasure to introduce the special issue of the top picks from the computer architecture conferences of 2013. The special issue presents a selection of 12 papers that describe novel, exciting research directions in areas as diverse as design of datacenters, processors and accelerators, networks on chip, programmability-enhancing frameworks, and emerging large caches.

    The review process

    We received a total of 101 submissions. The full program committee of 30 members (see the sidebar "The Selection Committee") reviewed all submissions. Each paper received at least four reviews (with many receiving five reviews) from the program committee. In cases where one Selection Committee chair had a conflict of interest with a submission, the other chair handled the review assignments. There were no papers on which both Selection Committee chairs had conflicts. In addition to the Selection Committee reviews, four external reviews were also sought for unique cases where we felt specific outside expertise was needed. Papers with high variance in scores were also targeted for additional online discussion and, in some cases, additional reviews. We thank the committee and the external reviewers for their time and effort toward this valuable service to the computer architecture community.

    Note that, in addition to papers published in 2013, selected papers published in 2012 were also eligible for inclusion in this year's issue of Top Picks because of the conflict-handling rules of Top Picks. Under these rules, the selection committee chairs may not submit their own papers in the year they serve as chair. However, their papers are eligible for full consideration in the following year.

    We selected 41 top-ranked papers (based on the average overall merit score for each paper) for discussion at the PC meeting. Furthermore, to minimize the impact of variations in reviewer generosity, we verified that the 41 papers included the top-ranked papers of most individual committee members. We encouraged the committee to champion other papers for discussion that may have been among the top papers in their assigned reviews if such papers had not automatically qualified for discussion based on the overall score. Consequently, one additional paper was added to the discussion list, taking the total to 42.
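
    The shortlisting step described above can be sketched roughly as follows. This is a minimal illustration, not the committee's actual tooling: the function name, the data layout, and the toy scores are all hypothetical, and the check against each reviewer's individual ranking is left out.

```python
from statistics import mean

def build_discussion_list(scores, top_n, championed=()):
    """Return paper IDs to discuss: the top_n by mean score plus championed extras.

    scores maps a paper ID to the list of overall merit scores it received
    (hypothetical data layout; higher is better).
    """
    ranked = sorted(scores, key=lambda paper: mean(scores[paper]), reverse=True)
    shortlist = ranked[:top_n]
    # Championed papers are appended only if they did not already qualify.
    extras = [paper for paper in championed if paper not in shortlist]
    return shortlist + extras

# Toy example with made-up scores for five papers.
toy_scores = {
    "A": [5, 4, 5, 4],
    "B": [3, 3, 4, 3],
    "C": [4, 4, 4, 5],
    "D": [2, 3, 2, 3],
    "E": [4, 3, 3, 4],
}
discussion = build_discussion_list(toy_scores, top_n=3, championed=["D", "A"])
print(discussion)
```

    In the process described here, top_n would be 41 and a single championed paper brought the list to 42.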

    The Selection Committee discussed all 42 papers at a meeting in Boston on 10 January (with 28 members attending physically and two participating via teleconference). Committee members with conflicts left the room before papers were discussed. The meeting was conducted in two phases. In the first phase, the committee voted to accept or reject papers without regard to the total number of papers with the explicit understanding that we may overshoot the target. In the second phase, the committee revisited the specific shortlisted papers to arrive at the final list of 12 papers (see the "Top Picks of 2013" sidebar). We congratulate the authors on this well-deserved accolade.

    Mithuna S. Thottethodi, Purdue University
    Shubu Mukherjee, Cavium

    The selected papers

    The selected papers are responsive to many of the pressing problems that we face today. The emergence of cloud computing fueled by social media networks is leading to innovations in datacenters. The continuous need to improve the energy efficiency of these clouds of processors, memory, and disks has led to high performance-per-watt mechanisms, such as accelerator engines, better scheduling of datacenter resources, and new styles of processor, cache, memory, and network design more suited for datacenters and future workloads. Security continues to be an overriding concern in this world of public clouds and mobile computing, which has led to innovation in the security architecture of today's processors. As co-chairs of this IEEE Micro Top Picks issue, we are excited to present to our audience a glimpse of how architects envision solving today's challenging computing problems.

    Maximizing the use of renewable energy to power these large datacenters is important from a sustainability perspective. "Designing and Managing Datacenters Powered by Renewable Energy" by Íñigo Goiri et al. responds to this challenge by developing strategies to optimally use renewable energy from sources that fall under the commonly used colocation/self-generation model.

    In addition to energy efficiency, it is important to efficiently schedule available hardware resources to maximize performance in datacenters, especially in challenging environments where hardware is typically heterogeneous (due to rolling upgrades), and application performance is interference prone. In "Quality-of-Service-Aware Scheduling in Heterogeneous Datacenters with Paragon," Christina Delimitrou and Christos Kozyrakis develop a novel scalable scheduling technique that is heterogeneity and interference aware to significantly boost performance (compared to an oblivious scheduling approach).

    Although the computing landscape has changed dramatically from a desktop-and-local-software regime to cloud-based computing, processor designs have more or less remained the same. "A Case for Specialized Processors for Scale-Out Workloads" by Michael Ferdman et al. argues that there is a mismatch between modern processor hardware and the requirements of emerging cloud workloads. This work suggests directions in processor design for emerging cloud workloads. (The conference version of this paper was published in 2012, but it was eligible for Top Picks this year, per the conflict-handling rules we described earlier.)

    The Selection Committee

    Tor Aamodt, University of British Columbia
    David Albonesi, Cornell University
    David August, Princeton University
    Rajeev Balasubramonian, University of Utah
    Pradip Bose, IBM
    Doug Burger, Microsoft
    John Carter, IBM
    Joel Emer, Intel and Massachusetts Institute of Technology
    Babak Falsafi, Ecole Polytechnique Federale de Lausanne
    Antonio Gonzalez, Intel
    Sudhanva Gurumurthi, University of Virginia and Advanced Micro Devices
    Dan Jimenez, Texas A&M University
    David Kaeli, Northeastern University
    Alvin Lebeck, Duke University
    Hsien-Hsin Lee, Georgia Institute of Technology
    Gabriel Loh, Advanced Micro Devices
    Margaret Martonosi, Princeton University
    Kathryn McKinley, Microsoft and University of Texas at Austin
    Milo Martin, University of Pennsylvania
    Trevor Mudge, University of Michigan
    Satish Narayanaswamy, University of Michigan
    Eric Rotenberg, North Carolina State University
    Karu Sankaralingam, University of Wisconsin-Madison
    Yanos Sazeides, University of Cyprus
    Simha Sethumadhavan, Columbia University
    Andre Seznec, INRIA
    Dan Sorin, Duke University
    Dean Tullsen, University of California, San Diego
    T.N. Vijaykumar, Purdue University
    Sudhakar Yalamanchili, Georgia Institute of Technology

    Given that most of the cloud servers are multicore servers and given the increasing importance of the network-on-chip (NoC) fabric in such servers (the NoC latency is on every L1 cache miss path), the performance of the NoC becomes critical. In "Smart: Single-Cycle Multihop Traversals over a Shared Network on Chip," Tushar Krishna et al. design an NoC that opportunistically bypasses multiple routers in a single cycle in the absence of contention. Under ideal conditions, the router effectively mimics the latency of a fully connected network even though the packets traverse several hops.
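
    To see why bypassing routers matters, a toy latency model may help. It is only a back-of-the-envelope sketch under stated assumptions, not the design in the article: the per-hop router and link costs and the parameter max_hops_per_cycle are hypothetical values chosen for illustration, and contention is ignored.

```python
import math

def hop_by_hop_latency(hops, router_cycles=1, link_cycles=1):
    """Cycles for a packet that stops at every router along its path.

    The per-hop costs are assumed values, not figures from the article.
    """
    return hops * (router_cycles + link_cycles)

def bypass_latency(hops, max_hops_per_cycle):
    """Cycles when up to max_hops_per_cycle hops are covered in one cycle.

    Assumes no contention, so every bypass attempt succeeds.
    """
    return math.ceil(hops / max_hops_per_cycle)

for hops in (1, 4, 8):
    print(hops, hop_by_hop_latency(hops), bypass_latency(hops, max_hops_per_cycle=8))
```

    Under these assumptions, the bypassed latency stays at one cycle for any path no longer than max_hops_per_cycle, which is the sense in which the network can resemble a fully connected one.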

    To ensure privacy and to prevent information leakage through timing channels, it is important to provably ensure complete timing isolation. "Networks on Chip with Provable Security Properties" by Hassan M.G. Wassel et al. solves this problem for NoCs. Unlike prior QoS approaches (where a guaranteed minimum performance is adequate), the provable timing isolation shown in this article achieves stronger isolation to ensure that there are no timing interactions among different domains.

    As GPUs move toward providing more sophisticated memory models, the lack of viable coherence implementations remains a stumbling block. "Cache Coherence for GPU Architectures" by Inderpreet Singh et al. argues that revisiting the idea of temporal coherence might hold the key to efficient cache coherence implementations for GPU architectures.

    Die-stacked DRAM, which is on the cusp of widespread adoption, has received significant attention regarding its role in the memory hierarchy. However, little attention has been paid to its RAS characteristics. Jaewoong Sim et al., in their article "A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches," show that rather than carrying over RAS solutions from traditional DRAM, novel RAS solutions that are customized for die-stacked DRAM are preferable.

    Last-level caches are a precious resource and, as such, there is strong motivation to use compression to squeeze out more effective capacity. The article "Decoupled Compressed Cache: Exploiting Spatial Locality for Energy Optimization" by Somayeh Sardashti and David A. Wood overcomes key limitations of prior compression techniques in terms of fragmentation and tag limits by leveraging decoupled organization.

    Top Picks of 2013

    "Designing and Managing Datacenters Powered by Renewable Energy" by Íñigo Goiri, William Katsak, Kien Le, Thu D. Nguyen, and Ricardo Bianchini
    "Quality-of-Service-Aware Scheduling in Heterogeneous Datacenters with Paragon" by Christina Delimitrou and Christos Kozyrakis
    "A Case for Specialized Processors for Scale-Out Workloads" by Michael Ferdman, Almutaz Adileh, Onur Kocberber, Stavros Volos, Mohammad Alisafaee, Djordje Jevdjic, Cansu Kaynak, Adrian Daniel Popescu, Anastasia Ailamaki, and Babak Falsafi
    "Smart: Single-Cycle Multihop Traversals over a Shared Network on Chip" by Tushar Krishna, Chia-Hsin Owen Chen, Woo-Cheol Kwon, and Li-Shiuan Peh
    "Networks on Chip with Provable Security Properties" by Hassan M.G. Wassel, Ying Gao, Jason K. Oberg, Ted Huffmire, Ryan Kastner, Frederic T. Chong, and Timothy Sherwood
    "Cache Coherence for GPU Architectures" by Inderpreet Singh, Arrvindh Shriraman, Wilson W.L. Fung, Mike O'Connor, and Tor M. Aamodt
    "A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches" by Jaewoong Sim, Gabriel H. Loh, Vilas Sridharan, and Mike O'Connor
    "Decoupled Compressed Cache: Exploiting Spatial Locality for Energy Optimization" by Somayeh Sardashti and David A. Wood
    "Sonic Millip3De: An Architecture for Handheld 3D Ultrasound" by Richard Sampson, Ming Yang, Siyuan Wei, Chaitali Chakrabarti, and Thomas F. Wenisch
    "Hardware Partitioning for Big Data Analytics" by Lisa Wu, Raymond J. Barker, Martha A. Kim, and Kenneth A. Ross
    "Efficient Spatial Processing Element Control via Triggered Instructions" by Angshuman Parashar, Michael Pellauer, Michael Adler, Bushra Ahsan, Neal Crago, Daniel Lustig, Vladimir Pavlov, Antonia Zhai, Mohit Gambhir, Aamer Jaleel, Randy Allmon, Rachid Rayess, Stephen Maresh, and Joel Emer
    "DeNovoND: Efficient Hardware for Disciplined Nondeterminism" by Hyojin Sung, Rakesh Komuravelli, and Sarita V. Adve

    In the context of domain-specific computing, Richard Sampson et al. develop a low-power, high-performance solution for 3D ultrasound in their article, "Sonic Millip3De: An Architecture for Handheld 3D Ultrasound." Beyond the immediate application of 3D ultrasound imaging, the article is a case study for accelerator design. The solution, which relies on hardware-algorithm codesign, develops a new accelerator architecture to bring the 3D beamforming problem within the desired performance/power envelope.

    Continuing with the same theme of novel accelerators, "Hardware Partitioning for Big Data Analytics" by Lisa Wu et al. describes a low-area-overhead hardware accelerator that significantly improves data partitioning performance for the important class of database workloads.
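
    For readers unfamiliar with data partitioning, the software operation that such an accelerator would speed up can be sketched as below. The summary above does not say which partitioning scheme the accelerator implements; range partitioning on a key is assumed here purely for illustration, and the function name and splitter values are hypothetical.

```python
from bisect import bisect_right

def range_partition(records, splitters, key=lambda record: record):
    """Assign each record to a partition by comparing its key with sorted splitters.

    With k splitters there are k + 1 partitions; a key equal to a splitter
    goes to the partition on that splitter's right.
    """
    partitions = [[] for _ in range(len(splitters) + 1)]
    for record in records:
        partitions[bisect_right(splitters, key(record))].append(record)
    return partitions

# Toy input: six integer keys split at 10 and 50 (made-up values).
parts = range_partition([7, 42, 15, 3, 99, 20], splitters=[10, 50])
print(parts)
```

    Each record costs one binary search over the splitters, and that per-record comparison work is the kind of cost that makes partitioning a natural target for hardware support.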

    In "Efficient Spatial Processing Element Control via Triggered Instructions," Angshuman Parashar et al. target spatial accelerators and develop a novel approach to control flow that eliminates the performance problems associated with program counter-based control flow used in prior spatial accelerators and architectures.

    The article DeNovoND: Efcient Hard-ware for Disciplined Nondeterminism byHyojin Sung et al. proposes a design thatsimplies coherence implementation via dis-ciplined coding while still allowing key non-determinism features (which is critical forlock-based codes).

    We hope that you enjoy reading thesearticles, as well as their original con-ference versions, and we welcome your feed-back on this issue. MICRO

    Acknowledgments

    We thank Erik Altman for his support.

    We thank the web chairs Ahmed Abdel-Gawad, Timothy Pritchett, and Eric Villasenor, who helped ensure a stable and glitch-free experience with the conference software.

    Mithuna S. Thottethodi is an associate professor in the School of Electrical and Computer Engineering at Purdue University. His research interests include parallel programming, parallel architecture, interconnection networks, storage, and multicore memory hierarchies. Thottethodi has a PhD in computer science from Duke University. He is a member of IEEE and the ACM.

    Shubu Mukherjee is a distinguished engineer and the lead architect for the ARMv8 processor core at Cavium. His research interests include innovation confluencing and computer architecture. Mukherjee has a PhD in computer science from the University of Wisconsin-Madison. He is a Fellow of IEEE and the ACM.

    Direct questions and comments about this issue to Mithuna S. Thottethodi at [email protected] or to Shubu Mukherjee at [email protected].


    MAY/JUNE 2014 7


  • DESIGNING AND MANAGING DATACENTERS POWERED BY RENEWABLE ENERGY

    ON-SITE RENEWABLE ENERGY HAS THE POTENTIAL TO REDUCE DATACENTERS' CARBON FOOTPRINT AND POWER AND ENERGY COSTS. THE AUTHORS BUILT PARASOL, A SOLAR-POWERED DATACENTER, AND GREENSWITCH, A SYSTEM FOR SCHEDULING WORKLOADS, TO EXPLORE THIS POTENTIAL IN A CONTROLLED RESEARCH SETTING.

    Datacenters range from a few servers in a machine room to thousands of servers housed in warehouse-size installations.1 Estimates for 2010 indicate that, collectively, datacenters consume around 1.5 percent of the total electricity used worldwide.1 This translates into high carbon emissions, as most of this electricity comes from fossil fuels. A 2008 study estimated that datacenters emit 116 million metric tons of carbon, slightly more than the entire country of Nigeria.2

    With increasing societal demand for cleaner products and services, several companies have announced plans to build green datacenters, that is, datacenters partially or completely powered by renewables such as solar or wind energy. These datacenters will either generate their own renewable energy (self-generation) or draw it directly from an existing nearby plant (colocation). For example, Apple and McGraw-Hill have built large solar arrays for their datacenters, whereas Green House Data is a small cloud provider that operates entirely on renewables. Although there are other approaches, these examples suggest that many datacenters that seek to lower emissions will prefer colocation or self-generation. In our paper for the 18th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2013),3 we discuss the current and expected future cost and space needs of on-site solar and wind generation.

    Colocation and self-generation pose an interesting research challenge: solar and wind energy are intermittent, which requires approaches for tackling the energy supply variability. One approach is to use batteries and/or the electrical grid as a backup for the renewable energy. It might also be possible to adapt the workload (the energy demand) to match the renewable energy supply.4-8 For the highest benefits, green datacenter operators must intelligently manage their workloads and the energy sources at their disposal. For example, when the workload is deferrable (that is, it can be delayed within a time bound), it might be appropriate to delay some of the load and store the freed-up renewable energy in the batteries for later use (for example, to shave an expected load peak when the renewable energy is not available).

    Íñigo Goiri

    William Katsak

    Kien Le

    Thu D. Nguyen

    Ricardo Bianchini

    Rutgers University


    Published by the IEEE Computer Society. 0272-1732/14/$31.00 © 2014 IEEE


  • As far as we know, green datacenter operators do not currently manage their energy sources and workloads in this manner.

    We set out to build software and hardware to explore these issues. This article overviews two of our main efforts: Parasol and GreenSwitch.

    Parasol

    Figure 1a shows Parasol, a solar-powered datacenter that we built as a research platform to study colocation and self-generation. Parasol comprises a steel structure, a small custom container housing two racks of servers and networking equipment, an air-side economizer free-cooling unit and a direct-expansion air conditioner, 16 solar panels (producing up to 3.2 kW AC), two DC/AC inverters, 16 lead-acid batteries (storing up to 32 kWh), two charge controllers, and an electricity grid tie. Parasol currently houses 64 Atom-based servers (consuming at most 30 W each), but it is large enough to house 150 of them. It uses free cooling whenever outside temperatures and humidity are low enough, and air conditioning otherwise. Parasol can use solar energy directly, store it in its batteries, or feed it to the grid for credit (net metering). We thought about adding a wind turbine to Parasol, but historical weather data shows that our location (Piscataway, N.J.) is not windy enough.
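    The component figures quoted above allow a quick sizing check. The following is a minimal sketch that uses only the numbers stated in the text; the function names and the `usable_fraction` parameter are illustrative assumptions, and the idealized runtime ignores networking equipment, cooling, conversion losses, and battery depth-of-discharge limits.

```python
# Back-of-the-envelope sizing from the figures quoted in the text.
SERVERS_INSTALLED = 64
SERVERS_MAX = 150
SERVER_PEAK_W = 30.0   # "at most 30 W each"
SOLAR_PEAK_KW = 3.2    # peak AC output of the 16 panels
BATTERY_KWH = 32.0     # nominal storage of the 16 batteries


def it_peak_kw(servers: int, watts_each: float = SERVER_PEAK_W) -> float:
    """Upper bound on server power draw, in kW."""
    return servers * watts_each / 1000.0


def battery_hours(load_kw: float, capacity_kwh: float = BATTERY_KWH,
                  usable_fraction: float = 1.0) -> float:
    """Idealized runtime on batteries alone (no losses, no cooling).

    `usable_fraction` is a hypothetical knob for limiting depth of
    discharge; the article does not state a value.
    """
    return capacity_kwh * usable_fraction / load_kw


current_kw = it_peak_kw(SERVERS_INSTALLED)
full_kw = it_peak_kw(SERVERS_MAX)
print(f"peak server load today:          {current_kw:.2f} kW")
print(f"peak server load at 150 servers: {full_kw:.2f} kW")
print(f"peak solar output:               {SOLAR_PEAK_KW:.2f} kW")
print(f"ideal battery runtime today:     {battery_hours(current_kw):.1f} h")
```

    Under these assumptions, the 64 installed servers draw at most 1.92 kW, which is below the 3.2 kW peak solar output, whereas a full complement of 150 servers (4.5 kW) would exceed it.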

    Figure 1b shows Parasol's power distribution and monitoring infrastructure. Because Parasol was built as a research instrument for studying power management in green datacenters, it is critical that we understand the power usage of each component, as well as power losses. Thus, we have power meters (labeled M in the figure), either internal to components (for example, the DC/AC inverters) or added on externally (for example, the cooling-system meter), for measuring the power flowing into and out of every component. Parasol also includes a switch that allows for powering the cooling system from the main electrical panel or only from the grid. This enables experimentation with or without the cooling system loading the solar system and batteries.

    Figure 1. Parasol: outside view showing the solar panels, container, and air conditioning unit (a); power distribution and monitoring infrastructure (b). The cooling system can be powered solely by the grid, or by the main electrical panel that receives power from all sources. Meters (M) are available for measuring the power flowing into and out of every component. [Diagram labels: solar panels, inverter, battery controller, batteries, main electrical panel, grid electrical panel, electrical grid, PDU, IT, free cooling, air conditioner.]

    We describe our rationale for the Parasol design and the mistakes we made while building it over 16 months (at a total cost of $300,000) in our ASPLOS paper.3 In this article, we report on data gathered from operating Parasol over 22 months. Specifically, solar generation and the IT equipment became operational in April 2012, and Parasol became fully operational in June 2012.

    Energy production and usage

    Figure 2 shows energy usage, net-metered energy, and the average inside and outside temperatures from April 2012 to January 2014. We computed a power usage effectiveness (PUE) of 1.06 to 1.08, depending on the computing load, owing to losses from various conversions. April through June 2012 show little or no grid energy consumption, because the external meters did not become operational until the end of June 2012. Note that total solar energy production is the sum of solar energy consumed and solar energy net metered. This data shows that during the summer months Parasol produces more than 500 kWh every month, whereas during the winter this production is reduced to less than half. For the year spanning July 2012 through June 2013, we computed an average solar capacity factor of 16 percent. During this time, Parasol supported workloads used for studying GreenSwitch and six other research projects.
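    The capacity factor relates the energy actually produced to what the array would produce if it ran at rated output all the time. The sketch below applies the standard definition to the 16 percent figure; it assumes the 3.2 kW AC peak output quoted earlier is the rated power, which the article does not state explicitly.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap year


def capacity_factor(energy_kwh: float, rated_kw: float, hours: float) -> float:
    """Fraction of the rated output actually delivered over `hours`."""
    return energy_kwh / (rated_kw * hours)


def energy_from_capacity_factor(cf: float, rated_kw: float, hours: float) -> float:
    """Inverse relation: energy implied by a given capacity factor."""
    return cf * rated_kw * hours


# Assumption: the rated power is the 3.2 kW AC peak quoted for the panels.
annual_kwh = energy_from_capacity_factor(0.16, 3.2, HOURS_PER_YEAR)
print(f"implied annual production: {annual_kwh:.0f} kWh")
print(f"implied monthly average:   {annual_kwh / 12:.0f} kWh")
```

    Under that assumption, the implied monthly average of roughly 374 kWh falls between the summer figure (more than 500 kWh) and the winter figure (less than half of that).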

    Interestingly, grid energy consumption in July 2012 was significantly lower than in other months because we were experimenting with GreenSwitch, transitioning machines to sleep, and using batteries (charged with solar energy) to reduce brown energy consumption. Starting in November 2012, we raised the internal setpoint temperature from 27°C to 30°C.

    Cooling

    Figure 3 shows the operation of the cooling system in Parasol during the second half of August 2012. In this time period, the setpoint for internal temperature was 30°C; the dashed line shows the actual internal temperature, whereas the solid line shows the outside temperature. The light gray area shows the operation of the free-cooling unit, whereas the dark gray area shows the

    Figure 2. Energy consumption, net metering, and temperatures from April 2012 to January 2014. The figure shows the seasonal patterns for both renewable energy generation and temperature. [Plot: energy (MWh) split into net metering, solar use, and grid use, with inside and outside temperature (°C), April 2012 through January 2014.]


    TOP PICKS


  • operation of the air conditioner. Note that even though this time period is in the summer, the air conditioner only ran during two days, when the outside temperatures exceeded 30°C. Much of the time, the free-cooling unit ran below 25 percent fan speed.

    The average PUE when including both conversion losses and cooling overheads for Parasol has been lower than 1.13, showing that free cooling is very effective at keeping cooling overheads low. The air conditioner has run for less than 20 days in a year, and less than 1 percent of the total time. Most of the time, our setpoint has been 30°C, and the typical temperatures inside Parasol (> 95 percent) have ranged between 22°C and 30°C. We have also been experimenting with novel cooling policies and pushing the limits of Parasol. During these experiments, the internal temperature at the control sensor has ranged between 15°C and 36°C.
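    PUE is the ratio of total facility energy to the energy delivered to the IT equipment. The following minimal sketch shows how conversion losses and cooling each enter that ratio; the meter readings are hypothetical values chosen only to land in the ranges reported in this article, not measurements from Parasol.

```python
def pue(it_kwh: float, conversion_loss_kwh: float = 0.0,
        cooling_kwh: float = 0.0) -> float:
    """Power usage effectiveness: total facility energy over IT energy."""
    if it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return (it_kwh + conversion_loss_kwh + cooling_kwh) / it_kwh


# Hypothetical readings for one period, used only to illustrate the formula.
it = 1000.0      # kWh delivered to the servers
losses = 70.0    # kWh lost in inverters, charge controllers, and wiring
cooling = 50.0   # kWh used by free cooling and air conditioning

print(f"PUE, conversion losses only: {pue(it, losses):.2f}")
print(f"PUE, losses plus cooling:    {pue(it, losses, cooling):.2f}")
```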

    Thus far, we have replaced five hard disk drives, two solid-state drives, and one motherboard. Although this data is not statistically significant, it is possible that our experiments have decreased the reliability of the IT equipment.

    Off-grid operation: Hurricane Sandy

    In late October 2012, the US East Coast was hit by Hurricane Sandy. The storm reached Rutgers University on 29 October, and the grid power and network suffered outages for more than 20 hours. Figure 4 shows the behavior of Parasol and the wind speed at our location from 28 October to 1 November. Rutgers lost power on a Monday afternoon, at the height of the measured wind speed (> 70 km/h), and it did not come back until the afternoon of the next day. During this time, Parasol used its batteries and solar energy to operate normally (although we did transition half of the machines to sleep because they were not being used). This experience demonstrates the potential for green datacenters to operate through power outages (or in remote locations without a reliable grid power source).

    Figure 3. Cooling system operation from 15 August 2013 through 30 August 2013. The setpoint for internal temperature was 30°C; the air conditioner only ran during two days, when the outside temperature exceeded 30°C. [Plot: inside and outside temperature (°C) and free-cooling and air-conditioner speed (%), by day.]

    Figure 4. Parasol's operation during Hurricane Sandy. Parasol used its batteries and solar energy to operate normally during a power outage of more than 20 hours. [Plot: power (kW) for IT load, battery discharge, battery charge, grid use, and solar use, with battery charge level (%) and wind speed (m/s), 28 to 31 October 2012.]

    GreenSwitch

    We now discuss our research on managing Parasol. Specifically, we describe GreenSwitch, a system for scheduling workloads, selecting which source of energy to use (renewable, battery, and/or grid), and choosing the renewable energy storage medium (battery or grid) at each point in time. GreenSwitch seeks to minimize the overall cost of grid electricity (including both grid energy and peak grid power), while respecting the characteristics of the workload and battery lifetime constraints. It can also manage workloads and energy sources during grid outages.

    Architecture

    Figure 5 illustrates the GreenSwitch architecture. The predictor forecasts the workload and the renewable energy production one day into the future at the granularity of one hour. The solver takes these predictions and the current battery charge level as input, and outputs a workload schedule and an energy source and storage schedule. To compute these schedules, the solver uses analytical models of workload behavior, battery use, and grid electricity cost. The configurer effects the changes prescribed by the solver. The changes may involve transitioning some servers between power states and/or changing the configuration of the energy sources. (We have identified configuration parameters to the inverters and charge controllers that give us nearly full dynamic control of every source of energy available to Parasol.)

    A full iteration of GreenSwitch occurs every 15 minutes, which enables it to properly control peak grid power use. (Utilities typically compute peak grid power use in windows of 15 minutes.) However, GreenSwitch checks the production of solar energy every 3 minutes. During each of these checks, GreenSwitch runs a full iteration if there has been an unexpected change in production.
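    The two-rate timing described above can be sketched as a small simulation. This is an illustration of the cadence only, not the GreenSwitch implementation: the `tolerance_kw` threshold for an "unexpected change" is a hypothetical parameter, and the sketch assumes that an early iteration does not reset the 15-minute schedule, which the article does not specify.

```python
from typing import Callable, List

FULL_PERIOD_MIN = 15   # full iteration period stated in the text
CHECK_PERIOD_MIN = 3   # solar production check period stated in the text


def simulate(duration_min: int,
             measured_kw: Callable[[int], float],
             predicted_kw: Callable[[int], float],
             tolerance_kw: float = 0.3) -> List[str]:
    """Return a log of the action taken at each 3-minute tick.

    `tolerance_kw` is a hypothetical threshold for what counts as an
    unexpected change in solar production.
    """
    log = []
    for t in range(0, duration_min, CHECK_PERIOD_MIN):
        if t % FULL_PERIOD_MIN == 0:
            log.append(f"{t:3d} min: scheduled full iteration")
        elif abs(measured_kw(t) - predicted_kw(t)) > tolerance_kw:
            log.append(f"{t:3d} min: early full iteration (solar off forecast)")
        else:
            log.append(f"{t:3d} min: check only")
    return log


# Toy inputs: the forecast is a flat 2.0 kW, but a cloud passes at minute 6.
def forecast(t: int) -> float:
    return 2.0


def actual(t: int) -> float:
    return 1.2 if t == 6 else 2.0


for line in simulate(30, actual, forecast):
    print(line)
```

    In this toy run, full iterations happen at minutes 0 and 15 on schedule, plus one early iteration at minute 6 when measured production deviates from the forecast.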

    GreenSwitch evaluation on Parasol

    We perform day-long experiments with Parasol and an implementation of GreenSwitch for the Hadoop MapReduce framework. We study two widely different Hadoop traces, called Facebook and Nutch. The former derives from a larger batch-job trace from Facebook,9 whereas the latter is the indexing part of a Web search system.10 We instantiate our models with the on-peak/off-peak grid energy prices and the peak grid power charges at our location. We assume the utility pays the wholesale price of electricity for net metering.
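    The cost components named above (time-varying energy prices, a peak power charge, and a wholesale credit for net metering) can be combined into a simple three-part tariff. The sketch below is an assumption about how such a tariff might be computed, not the model used in the evaluation; the function name, the per-interval representation, and every price in the example are hypothetical.

```python
from typing import Sequence

INTERVAL_H = 0.25  # 15-minute metering window, in hours


def grid_cost(grid_kw: Sequence[float],
              net_meter_kw: Sequence[float],
              price_per_kwh: Sequence[float],
              wholesale_per_kwh: float,
              peak_charge_per_kw: float) -> float:
    """Cost of one billing period under a simple three-part tariff.

    Each sequence holds one value per 15-minute interval: average power
    drawn from the grid, average power fed back to the grid, and the
    retail energy price in effect during that interval.
    """
    if not (len(grid_kw) == len(net_meter_kw) == len(price_per_kwh)):
        raise ValueError("all series must have the same length")
    # Energy charge: kWh drawn in each interval times that interval's price.
    energy = sum(p * INTERVAL_H * c for p, c in zip(grid_kw, price_per_kwh))
    # Net-metering credit: kWh fed back, paid at the wholesale price.
    credit = sum(p * INTERVAL_H for p in net_meter_kw) * wholesale_per_kwh
    # Demand charge: billed on the highest 15-minute average draw.
    demand = max(grid_kw, default=0.0) * peak_charge_per_kw
    return energy + demand - credit


# Hypothetical four-interval example; all prices are made up.
cost = grid_cost(
    grid_kw=[1.0, 2.0, 0.0, 0.0],
    net_meter_kw=[0.0, 0.0, 0.5, 1.0],
    price_per_kwh=[0.10, 0.10, 0.20, 0.20],
    wholesale_per_kwh=0.04,
    peak_charge_per_kw=5.0,
)
print(f"cost: ${cost:.3f}")
```

    Because the demand charge depends on the single highest interval, a scheduler that flattens grid draw can lower the bill even when total grid energy is unchanged.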

    Figure 5. GreenSwitch architecture. Rectangles with round edges are data structures. Rectangles with square borders are processes. [Diagram labels: predictor (energy availability prediction, workload prediction), battery charge level, solver, energy source schedule, workload schedule, configurer, Parasol.]

    In the Facebook trace, jobs arrive throughout the day.9 Figure 6 shows the GreenSwitch behavior when the jobs in the trace are deferrable (each job can be delayed by up to 1 day), on 1 July 2012. The fill colors represent the use of the different energy sources, whereas the lines are the solar energy production (full), the IT load (dots), the grid energy price (dashes, y-axis on the right), and the current peak grid power draw (dashes and dots). The white fill represents solar energy that was produced but lost because of inefficiency. The figure shows that GreenSwitch transitioned many servers to sleep in the early hours of the day and deferred some of the load until solar energy was available. When there was no solar energy, GreenSwitch drew energy from the batteries, since they stored enough capacity for the load that was not deferred. We also see that the solar energy was enough to power the workload, charge the batteries, and feed energy to the grid. Compared to a grid-only datacenter, GreenSwitch produced a profit of 9 percent in grid electricity cost. Given this profit, GreenSwitch would amortize the cost of the solar setup and batteries in only 7.6 years.
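    The deferral behavior described above can be illustrated with a toy greedy scheduler. This is not the GreenSwitch solver, which relies on analytical models of workload behavior, battery use, and grid electricity cost; the sketch ignores prices, peak power, conversion losses, and battery lifetime, and all inputs are hypothetical hourly energy values.

```python
from collections import deque
from typing import Dict, List


def greedy_schedule(solar_kwh: List[float], load_kwh: List[float],
                    max_delay_h: int, battery_kwh: float,
                    battery_cap_kwh: float) -> Dict[str, float]:
    """Defer work toward solar hours; totals are returned in kWh."""
    totals = {"solar": 0.0, "battery": 0.0, "grid": 0.0, "net_metered": 0.0}
    backlog = deque()  # entries are [deadline_hour, remaining_kwh], oldest first
    last = len(solar_kwh) - 1
    for h, (sun, arriving) in enumerate(zip(solar_kwh, load_kwh)):
        if arriving > 0:
            backlog.append([min(h + max_delay_h, last), arriving])
        # 1. Run as much queued work as the sun can power right now.
        while backlog and sun > 0:
            used = min(sun, backlog[0][1])
            sun -= used
            backlog[0][1] -= used
            totals["solar"] += used
            if backlog[0][1] <= 1e-12:
                backlog.popleft()
        # 2. Work that has reached its deadline must run: battery, then grid.
        while backlog and backlog[0][0] <= h:
            _, need = backlog.popleft()
            from_batt = min(need, battery_kwh)
            battery_kwh -= from_batt
            totals["battery"] += from_batt
            totals["grid"] += need - from_batt
        # 3. Surplus sun charges the battery; the rest is net metered.
        charge = min(sun, battery_cap_kwh - battery_kwh)
        battery_kwh += charge
        totals["net_metered"] += sun - charge
    return totals


# Hypothetical six-hour day: 1 kWh of work arrives every hour.
solar = [0.0, 0.0, 2.0, 3.0, 2.0, 0.0]
load = [1.0] * 6
deferred = greedy_schedule(solar, load, max_delay_h=3,
                           battery_kwh=1.0, battery_cap_kwh=2.0)
immediate = greedy_schedule(solar, load, max_delay_h=0,
                            battery_kwh=1.0, battery_cap_kwh=2.0)
print("deferrable:", deferred)
print("immediate: ", immediate)
```

    In this toy run, allowing a three-hour delay removes the 1 kWh of grid energy that the immediate schedule needs, because the early-morning work waits for the sun instead of draining the battery.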

    Despite seeking primarily to minimize grid electricity cost, GreenSwitch is also successful at reducing carbon footprints. It achieves reductions in grid energy use between 36 and 100 percent in our experiments with Facebook and Nutch, compared to a grid-only datacenter.

    Main lessons learned

    We have learned many important lessons in building Parasol and GreenSwitch. First, we learned that engineering contractors are unfamiliar with the state of the art in datacenter design or with research prototypes. Our inability to bridge this knowledge gap quickly (or at all) caused delays. This is a challenge for organizations that want to build datacenters but lack the expertise.

    Because Parasol was a major undertaking, its design needed to enable research on many topics (such as solar energy, free cooling, and wimpy servers). However, because we had not yet started to research every topic, we ended up designing more features and flexibility into Parasol than we might eventually need. This increased costs.

    We also found that the need to collect fine-grained power measurements and accurately estimate energy losses led to extra design complexity. In addition, placing Parasol on the roof of a building (instead of on the ground) prevented shading from other buildings. Moreover, the cost of the roof placement was roughly the same as that of extending networking and power to ground locations far enough away from buildings.

    We learned that the wimpy fans in wimpy servers can generate nontrivial temperature differences across a free-cooled datacenter. Finally, and most importantly, we learned that building a real prototype is critical for completely understanding green datacenters. For example, in designing GreenSwitch, we detected instability in our charge controllers when switching power sources. As a result, GreenSwitch performs these switches in steps, with some idle time in between. Such effects would have been overlooked in simulation.

    Figure 6. GreenSwitch on deferrable Facebook workload. Most of the load during the night was delayed until renewable energy became available. Batteries were used when no renewable energy was available. [Plot: power (kW) for battery discharge, battery charge, grid use, net metering, and solar use, with IT load, solar available, grid energy price ($/kWh), and peak grid power, over 24 hours.]

    Potential long-term impact

    We expect Parasol and GreenSwitch to have a lasting impact on both academia and industry for several reasons.

    Renewable energy

    As we mentioned earlier, several companies are starting to invest in datacenter colocation and self-generation. Regardless of whether they're making these investments for market positioning, public relations, cost, or environmental reasons, the fact is that they are expecting bottom-line benefits from them. Moreover, despite their decreasing but still-high capital costs, exploiting renewables in datacenters could reduce overall energy costs, peak grid power costs, or both, as our ASPLOS paper explains. We expect that an increasing number of companies will see benefits in exploiting renewables.

    Some research groups have also started studying colocated and self-generating datacenters.4,5,7,11,12 These studies have been attracting the attention of a growing community, with publications in venues such as the International Symposium on Computer Architecture (ISCA) and the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). We expect that our design and experience with Parasol will accelerate this growth, as researchers realize that they can build nontrivial prototypes at relatively low cost. Moreover, our analysis of solar and wind energy cost and space requirements suggests that green datacenters will become increasingly attractive.3

    More broadly than datacenters, our experience will likely encourage more researchers to consider the implications of external signals (such as variable-electricity pricing and availability) on computing and communication in general.

    Green datacenter prototype

    There has been a dearth of real platforms for the study of colocated and self-generating green datacenters. Parasol addresses this need and is the first platform of its kind. Prior studies have had to resort to simulations or small implementations. In our ASPLOS paper,3 we list instances in which such alternatives would have hidden important effects. We mentioned instability issues earlier. Another example is that energy losses (for example, in power conversion) are highly dependent on load, rather than a fixed percentage, as often assumed in simulation. These instances will encourage researchers to build prototypes for their studies. We expect the Parasol design to serve as a model for these future research prototypes. Moreover, Parasol enables research on various important topics, including solar energy and its impact on computing, energy storage and its ability to lower costs, free cooling and its impact on reliability, wimpy servers and their performance/energy trade-offs, and the development of distributed storage systems using solid-state drives. These topics are of interest to both industry and academia.
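    The contrast between a fixed-percentage loss and a load-dependent loss can be made concrete with a small comparison. The sketch below uses a quadratic polynomial, one common way to approximate converter losses; the functional form and every coefficient are hypothetical and are not fitted to Parasol measurements.

```python
def fixed_fraction_loss(load_kw: float, fraction: float = 0.07) -> float:
    """Simulation-style assumption: loss is a constant share of the load."""
    return load_kw * fraction


def load_dependent_loss(load_kw: float, idle_kw: float = 0.05,
                        linear: float = 0.02, quadratic: float = 0.01) -> float:
    """Hypothetical converter model: standby draw plus load-dependent terms."""
    return idle_kw + linear * load_kw + quadratic * load_kw ** 2


for load in (0.25, 1.0, 2.0, 3.0):
    fixed = fixed_fraction_loss(load)
    varying = load_dependent_loss(load)
    print(f"{load:4.2f} kW load: fixed {fixed / load:6.1%}, "
          f"load-dependent {varying / load:6.1%}")
```

    With these made-up coefficients, the relative loss is several times higher at light load than at moderate load, which is the kind of effect a fixed percentage cannot capture.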

    In its current form, Parasol is a blueprint for industry to build small-scale, low-density green datacenters for enterprises and educational institutions. Self-generating containers are cheaper and more practical to operate than machine rooms inside existing buildings, and they can be placed in less-valuable locations. Parasol is also suitable for remote deployments with poor or no access to electricity (networking might need to take place over satellite in this case).

    Energy source and storage manager for green datacenters

    GreenSwitch simultaneously manages workload demand, multiple energy sources (renewable, battery, and grid), and multiple energy stores (battery and grid). Our results show that it is consistently effective at reducing grid electricity costs and carbon footprints.

    Although often overlooked in academia, simplicity and adaptability are key requirements for practical adoption by industry. We designed GreenSwitch to have both properties. Specifically, it uses simple models of solar energy availability, energy demand, and battery behavior. In addition, although our current implementation targets Hadoop, GreenSwitch is modular in that only one component (the configurer) is specific to the underlying computing framework.

    Research avenues

    Parasol and GreenSwitch create many new research avenues. For example, Parasol enables the study of the interplay between solar energy and free cooling; interestingly, solar energy is most abundant when the outside temperature is hottest (that is, when standard chiller-based cooling might be necessary in warm climates). As another example, GreenSwitch demonstrates the benefits of aggressive and coordinated management of energy sources and stores and workload execution, as well as the interplay between using batteries for powering the workload and for storing renewable energy. Prior work on aggressive use of batteries did not consider renewables.13

    Datacenters that are partially powered by renewable energy represent an increasingly interesting research topic from many perspectives. In this article, we have described Parasol, a solar-powered datacenter that we have built as a research platform, and our experience in constructing and operating Parasol. We have also described GreenSwitch, a workload and power source management system. As we mentioned earlier, Parasol and GreenSwitch enable the exploration of many research avenues. We are currently studying the behavior and management of free-cooled datacenters, as well as the interaction between solar energy and free cooling. We are also studying the design of green energy-aware latency-sensitive applications, such as cloud-based distributed storage systems. Specifically, we are exploring how to design systems that can maintain service-level objectives (for example, a desired 99th percentile response time), while maximizing usage of renewable energy and minimizing usage of brown energy. In conclusion, we hope that our experience with Parasol and GreenSwitch will entice other researchers and practitioners to consider these datacenters.

    Acknowledgments

    We thank Abhishek Bhattacharjee, David Meisner, Santosh Nagarakatte, Anand Sivasubramaniam, and Thomas F. Wenisch for comments that helped us improve this article. We are also grateful to our sponsors, NSF grant CSR-1117368, and the Rutgers Green Computing Initiative. Finally, we are indebted to Joan Stanton, Heidi Szymanski, Jon Tenenbaum, Chuck Depasquale, SMA America, and Michael J. Pazzani for their extensive help in building and funding Parasol.

    References

    1. J. Koomey, Growth in Data Center Electricity Use 2005 to 2010, Analytic Press, 2011.

    2. J. Mankoff, R. Kravets, and E. Blevis, "Some Computer Science Issues in Creating a Sustainable World," Computer, vol. 41, no. 8, 2008, pp. 102-105.

    3. I. Goiri et al., "Parasol and GreenSwitch: Managing Datacenters Powered by Renewable Energy," Proc. 18th Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS 13), 2013, pp. 51-64.

    4. B. Aksanli et al., "Utilizing Green Energy Prediction to Schedule Mixed Batch and Service Jobs in Data Centers," Proc. 4th Workshop Power-Aware Computing and Systems (HotPower 11), 2011, article no. 5.

    5. I. Goiri et al., "GreenSlot: Scheduling Energy Consumption in Green Datacenters," Proc. Int'l Conf. High Performance Computing, Networking, Storage and Analysis (SC 11), 2011, article no. 20.

    6. I. Goiri et al., "GreenHadoop: Leveraging Green Energy in Data-Processing Frameworks," Proc. 7th ACM European Conf. Computer Systems (EuroSys 12), 2012, pp. 57-70.

    7. A. Krioukov et al., "Integrating Renewable Energy Using Data Analytics Systems: Challenges and Opportunities," Data Eng. Bulletin, vol. 34, no. 1, 2011, pp. 3-11.

    8. Z. Liu et al., "Renewable and Cooling Aware Workload Management for Sustainable Data Centers," Proc. 12th ACM SIGMETRICS/PERFORMANCE Joint Int'l Conf. Measurement and Modeling of Computer Systems, 2012, pp. 175-186.

    9. Y. Chen et al., "The Case for Evaluating MapReduce Performance Using Workload Suites," Proc. Modeling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS), 2011, pp. 390-399.

    10. EPFL, CloudSuite, 2012; http://parsa.epfl.ch/cloudsuite/cloudsuite.html.

    11. C. Li, A. Qouneh, and T. Li, "iSwitch: Coordinating and Optimizing Renewable Energy Powered Server Clusters," Proc. 39th Ann. Int'l Symp. Computer Architecture (ISCA 12), 2012, pp. 512-523.

    12. N. Sharma et al., "Blink: Managing Server Clusters on Intermittent Power," Proc. 16th Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS 11), 2011, pp. 185-198.

    13. S. Govindan et al., "Leveraging Stored Energy for Handling Power Emergencies in Aggressively Provisioned Datacenters," Proc. 17th Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS 12), 2012, pp. 75-86.

    Íñigo Goiri is a research associate in the Department of Computer Science at Rutgers University. His research interests include energy-efficient datacenter design and virtualization. Goiri has a PhD in computer science from the Universitat Politècnica de Catalunya.

    William Katsak is a PhD student in the Department of Computer Science at Rutgers University. His research focuses on power management of datacenters. Katsak has an MS in computer science from Rutgers University. He is a student member of IEEE and the ACM.

    Kien Le is a software engineer at A10 Networks. His research focuses on building a cost-aware load distribution framework to reduce energy consumption and promote renewable energy. Le has a PhD in computer science from Rutgers University, where he completed the work for this article.

    Thu D. Nguyen is an associate professor in the Department of Computer Science at Rutgers University. His research interests include green computing, distributed and parallel systems, operating systems, and information retrieval. Nguyen has a PhD in computer science and engineering from the University of Washington. He is a member of IEEE and the ACM.

    Ricardo Bianchini is a professor in the Department of Computer Science at Rutgers University. He is currently on leave from Rutgers and working as the chief efficiency strategist at Microsoft. His research interests include the power, energy, and thermal management of servers and datacenters. Bianchini has a PhD in computer science from the University of Rochester. He is an ACM distinguished scientist and a senior member of IEEE.

    Direct questions and comments about this article to Íñigo Goiri, Department of Computer Science, Rutgers University, 110 Frelinghuysen Road, Piscataway, NJ 08854-8019; [email protected].


  • QUALITY-OF-SERVICE-AWARE SCHEDULING IN HETEROGENEOUS DATACENTERS WITH PARAGON

    PARAGON, AN ONLINE, SCALABLE DATACENTER SCHEDULER, ENABLES BETTER CLUSTER

    UTILIZATION AND PER-APPLICATION QUALITY-OF-SERVICE GUARANTEES BY LEVERAGING

    DATA MINING TECHNIQUES THAT FIND SIMILARITIES BETWEEN KNOWN AND NEW

    APPLICATIONS. FOR A 2,500-WORKLOAD SCENARIO, PARAGON PRESERVES PERFORMANCE

    CONSTRAINTS FOR 91 PERCENT OF APPLICATIONS, WHILE SIGNIFICANTLY IMPROVING

    UTILIZATION. IN COMPARISON, A BASELINE LEAST-LOADED SCHEDULER ONLY PROVIDES

    SIMILAR GUARANTEES FOR 3 PERCENT OF WORKLOADS.

    Efficiency is a first-class requirement and the main source of scalability concerns both for small and large systems.1,2 Achieving high efficiency is not only a matter of sensible design, but also a function of how the system is managed, which becomes essential as the hardware grows progressively heterogeneous and parallel and applications get dynamic and diverse. Architecture has traditionally been about efficient system design. As efficiency increases in importance, architecture should be about both design and management for systems of any scale.

    In this article, we focus on improving efficiency while guaranteeing high performance in large-scale systems. Although an increasing amount of computing now happens in public and private clouds, such as Amazon Elastic Compute Cloud (EC2; see http://aws.amazon.com/ec2) or vSphere (www.vmware.com/products/vsphere), datacenters continue to operate at utilizations in the single digits.1,3 This lessens the two main advantages of cloud computing (flexibility and cost efficiency) both for cloud operators and end users, because not only are the machines underutilized, they are also operating in a non-energy-proportional region.1,4

    There can be several reasons why machines are underutilized. Two of the most prominent obstacles are interference between coscheduled applications and heterogeneity in server platforms. For more information, see the "Interference and Heterogeneity" sidebar.

    In our paper presented at the 18th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2013),5 we introduced Paragon, an online and scalable

    Christina Delimitrou

    Christos Kozyrakis

    Stanford University

    0272-1732/14/$31.00 © 2014 IEEE. Published by the IEEE Computer Society.


  • Interference and Heterogeneity

    Interference occurs as coscheduled applications contend for shared resources. Coscheduled applications may interfere negatively even if they run on different processor cores because they share caches, memory channels, storage, and networking devices.1,2 If unmanaged, interference can result in performance degradations of integer factors,2 especially when the application must meet tail latency guarantees in addition to average performance.3 Figure A shows that an interference-oblivious scheduler will slow workloads down by 34 percent on average, with some running more than two times slower. This is undesirable for both users and operators.

    Heterogeneity is the natural result of the infrastructure's evolution, as servers are gradually provisioned and replaced over the typical 15-year lifetime of a datacenter.4-7 At any point in time, a datacenter may host three to five server generations with a few hardware configurations per generation, in terms of the processor speed, memory, storage, and networking subsystems. Managing the different hardware incorrectly not only causes significant performance degradations to applications sensitive to server configuration, but also wastes resources as workloads occupy servers for significantly longer, and gives a low-quality signal to hardware vendors for the design of future platforms. Figure A shows that a heterogeneity-oblivious scheduler will slow applications down by 22 percent on average, with some running nearly 2 times slower (see the "Methodology" section in the main article).

    Finally, a baseline scheduler that is oblivious to both interference and heterogeneity and schedules applications to least-loaded servers is even worse (48 percent average slowdown), causing some workloads to crash due to resource exhaustion on the server. Unless interference and heterogeneity are managed in a coordinated fashion, the system loses both its efficiency and predictability guarantees. Previous research has identified the issues of heterogeneity6 and interference,2 but while most cloud management systems, such as Mesos8 or vSphere (www.vmware.com/products/vsphere), have some notion of contention or interference awareness, they either use empirical rules for interference management or assume long-running workloads (for example, online services), whose repeated behavior can be progressively modeled. In this article, we target both heterogeneity and interference and assume no a priori analysis of the application. Instead, we leverage information the system already has about the large number of applications it has previously seen.

    References

    1. S. Govindan et al., "Cuanta: Quantifying Effects of Shared On-Chip Resource Interference for Consolidated Virtual Machines," Proc. 2nd ACM Symp. Cloud Computing, 2011, article no. 22.

    2. J. Mars et al., "Bubble-Up: Increasing Utilization in Modern Warehouse Scale Computers via Sensible Co-locations," Proc. 44th Ann. IEEE/ACM Int'l Symp. Microarchitecture, 2011, pp. 248-259.

    3. D. Meisner et al., "Power Management of Online Data-Intensive Services," Proc. 38th Ann. Int'l Symp. Computer Architecture (ISCA '11), 2011, pp. 319-330.

    4. L.A. Barroso and U. Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Morgan and Claypool Publishers, 2009.

    5. C. Kozyrakis et al., "Server Engineering Insights for Large-Scale Online Services," IEEE Micro, vol. 30, no. 4, 2010, pp. 8-19.

    6. J. Mars, L. Tang, and R. Hundt, "Heterogeneity in Homogeneous Warehouse-Scale Computers: A Performance Opportunity," IEEE Computer Architecture Letters, vol. 10, no. 2, 2011, pp. 29-32.

    7. R. Nathuji, C. Isci, and E. Gorbatov, "Exploiting Platform Heterogeneity for Power Efficient Data Centers," Proc. 4th Int'l Conf. Autonomic Computing (ICAC '07), 2007, doi:10.1109/ICAC.2007.16.

    8. B. Hindman et al., "Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center," Proc. 8th USENIX Conf. Networked Systems Design and Implementation, 2011, article no. 22.

    [Figure A plot. Y-axis: speedup over alone on best platform (0.0 to 1.0). X-axis: workloads (0 to 5,000). Legend: alone on best platform, no heterogeneity, no interference, least loaded.]

    Figure A. Performance degradation for 5,000 applications on 1,000 Amazon Elastic Compute Cloud (EC2) servers with heterogeneity-oblivious, interference-oblivious, and baseline least-loaded schedulers compared to ideal scheduling (application runs alone on best platform). Results are ordered from worst- to best-performing workload.


  • datacenter scheduler that accounts for heterogeneity and interference. The key feature of Paragon is its ability to quickly and accurately classify an unknown application with respect to heterogeneity (which server configurations it will perform best on) and interference (how much interference it will cause to coscheduled applications and how much interference it can tolerate itself in multiple shared resources). Unlike previous techniques that require detailed profiling of each incoming application, Paragon's classification engine exploits existing data from previously scheduled workloads and requires only a minimal signal about a new workload. Specifically, it is organized as a low-overhead recommendation system similar to the one deployed for the Netflix Challenge,6 but instead of discovering similarities in users' movie preferences, it finds similarities in applications' preferences with respect to heterogeneity and interference. It uses singular value decomposition (SVD) to perform collaborative filtering and identify similarities between incoming and previously scheduled workloads.

    Once an incoming application is classified, a greedy scheduler assigns it to the server that is the best possible match in terms of platform and minimum negative interference between all coscheduled workloads. Even though the final step is greedy, the high accuracy of classification leads to schedules that achieve both fast execution time and efficient resource usage. Paragon scales to systems with tens of thousands of servers and tens of configurations, running large numbers of previously unknown workloads. We implemented Paragon and showed that it significantly improves cluster utilization, while preserving per-application quality-of-service (QoS) guarantees both for small- and large-scale systems. For more information on related work, see the "Research Related to Paragon" sidebar.
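The greedy placement step described above can be illustrated with a minimal Python sketch. This is not Paragon's implementation: the `App` and `Server` types, the 0-to-1 score scales, the `interference_violation` metric, and the tie-breaking rule are all assumptions made for illustration. The sketch only shows the general idea of picking, for one classified application at a time, the server that causes the least interference, and preferring the better platform among equally good candidates.

```python
from dataclasses import dataclass, field


@dataclass
class App:
    name: str
    # Predicted performance per server configuration (higher is better); hypothetical 0..1 scale.
    platform_score: dict
    # Interference the app tolerates / causes per shared resource; hypothetical 0..1 scale.
    tolerated: dict
    caused: dict


@dataclass
class Server:
    name: str
    config: str
    apps: list = field(default_factory=list)


def interference_violation(server, new_app):
    """Total amount by which interference pressure exceeds tolerance if new_app joins server."""
    residents = server.apps + [new_app]
    total = 0.0
    for victim in residents:
        for resource, tolerance in victim.tolerated.items():
            # Pressure on the victim is what every other resident causes in this resource.
            pressure = sum(a.caused.get(resource, 0.0) for a in residents if a is not victim)
            total += max(0.0, pressure - tolerance)
    return total


def greedy_place(app, servers):
    """Pick the server with the least interference violation, breaking ties by platform score."""
    best = min(
        servers,
        key=lambda s: (interference_violation(s, app), -app.platform_score.get(s.config, 0.0)),
    )
    best.apps.append(app)
    return best


# Usage with made-up numbers: s1 already hosts a cache-heavy application.
cache_hog = App("cache_hog", {"A": 1.0, "B": 0.9}, {"cache": 0.5}, {"cache": 0.9})
servers = [Server("s1", "A", [cache_hog]), Server("s2", "B")]

sensitive = App("sensitive", {"A": 1.0, "B": 0.8}, {"cache": 0.2}, {"cache": 0.1})
tolerant = App("tolerant", {"A": 1.0, "B": 0.8}, {"cache": 1.0}, {"cache": 0.0})

first = greedy_place(sensitive, servers)   # avoids s1 even though it prefers configuration A
second = greedy_place(tolerant, servers)   # no violation on either server, so the platform decides
```

Each decision is local and never revisited, which is what makes the step greedy; the quality of the result therefore depends on how accurate the per-application scores are.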

    Fast and accurate classification

    The key requirement for heterogeneity- and interference-aware scheduling is to quickly and accurately classify incoming applications. First, we need to know how fast an application will run on each of the tens of server configurations (SCs) available. Second, we need to know how much interference it can tolerate from other workloads in each of several shared resources without significant performance loss and how much interference it will generate itself. Our goal is to perform online scheduling for large-scale systems without any a priori knowledge about incoming applications. Most previous schemes address this issue with detailed but offline application characterization or long-term monitoring and modeling.7-9 Paragon takes a different approach. Its core idea is that, instead of learning each new workload in detail, the system leverages information it already has about applications it has seen to express the new workload as a combination of known applications. For this purpose, we use collaborative filtering techniques that combine a minimal profiling signal about the new application with the large amount of data available from previously scheduled workloads. The result is fast and accurate classification of incoming applications with respect to heterogeneity and interference. Within a minute of its arrival, an incoming workload is scheduled on a large-scale cluster.

    Background on collaborative filtering

    Collaborative filtering techniques are frequently used in recommendation systems. We use one of their most publicized applications, the Netflix Challenge,6 to provide a quick overview of the two analytical methods we rely on, SVD and PQ reconstruction.10 In this case, the goal is to provide valid movie recommendations for Netflix users given the ratings they have provided for various other movies.

    The input to the analytical framework is a sparse matrix A, the utility matrix, with one row per user and one column per movie. The elements of A are the ratings that users have assigned to movies. Each user has rated only a small subset of movies; this is especially true for new users, who might only have a handful of ratings, or even none. Although techniques exist that address the cold-start problem (that is, providing recommendations to a completely fresh user with no ratings), we focus here on users for whom the system has some minimal input. If we can estimate the values of the missing ratings in the sparse matrix A,


  • we can make movie recommendations; that is, we can suggest that users watch the movies for which the recommendation system estimates, with high confidence, that they will give high ratings.
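As a small illustration of the utility matrix described above, the following Python sketch builds A from a handful of made-up ratings and lists the entries a recommender would need to estimate. The user names, movie titles, and rating values are all hypothetical. Missing ratings are stored as `None` rather than 0 so that they are not mistaken for low ratings.

```python
# Hypothetical observed ratings: (user, movie) -> rating on a 1-5 scale.
ratings = {
    ("alice", "Movie A"): 5,
    ("alice", "Movie B"): 4,
    ("bob", "Movie A"): 4,
    ("carol", "Movie C"): 2,
}

users = sorted({user for user, _ in ratings})
movies = sorted({movie for _, movie in ratings})

# Utility matrix A: one row per user, one column per movie; None marks a missing rating.
A = [[ratings.get((user, movie)) for movie in movies] for user in users]

# The entries that collaborative filtering would have to estimate.
missing = [(user, movie) for user in users for movie in movies if (user, movie) not in ratings]

# Fraction of entries that are actually observed.
density = len(ratings) / (len(users) * len(movies))
```

In the scheduling setting, the rows would presumably correspond to applications and the columns to server configurations or interference sources, with a new application contributing only a few observed entries.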

    The first step is to apply SVD, a matrix factorization method used for dimensionality reduction and similarity identification. Factoring A decomposes it into the following matrices of left (U) and right (V)

    Research Related to Paragon

    We discuss work relevant to Paragon in the areas of datacenter scheduling, virtual machine (VM) management, workload rightsizing, and scheduling for heterogeneous multicore chips.

    Datacenter scheduling

    Recent work on datacenter scheduling has highlighted the importance of platform heterogeneity and workload interference. Mars et al. showed that the performance of Google workloads can vary by up to 40 percent because of heterogeneity, even when considering only two server configurations, and by up to 2 times because of interference, even when considering only two colocated applications.1,2 Govindan et al. also present a scheme to quantify the effects of cache interference between consolidated workloads.3 In Paragon, we extend the concepts of heterogeneity- and interference-aware scheduling by providing an online, scalable, and low-overhead methodology that accurately classifies applications for both heterogeneity and interference across multiple resources.

    VM management

    Systems such as vSphere (http://www.vmware.com/products/vsphere) or the VM platforms on public cloud providers can schedule diverse workloads submitted by users on the available servers. In general, these platforms account for application resource requirements that they either expect the user to express or learn over time by monitoring workload execution. Paragon can complement such systems by making scheduling decisions on the basis of heterogeneity and interference and detecting when an application should be considered for rescheduling.

    Resource management and rightsizing

    There has been significant work on resource allocation in virtualized and nonvirtualized large-scale datacenters. Mesos performs resource allocation between distributed computing frameworks such as Hadoop or Spark.4 RightScale (http://www.rightscale.com) automatically scales out three-tier applications to react to changes in the load in Amazon's cloud service. DejaVu serves a similar goal by identifying a few workload classes and, based on them, reusing previous resource allocations to minimize reallocation overheads.5 In general, Paragon is complementary to rightsizing systems. Once such a system determines the amount of resources needed by an application, Paragon can classify and schedule it on the proper hardware platform in a way that minimizes interference.

    Scheduling for heterogeneous multicore chips

    Scheduling in heterogeneous chip multiprocessors (CMPs) shares some concepts and challenges with scheduling in heterogeneous datacenters; thus, some of the ideas in Paragon can be applied in heterogeneous CMP scheduling as well. Shelepov et al. present a scheduler for heterogeneous CMPs that is simple and scalable,6 whereas Craeynest et al. use performance statistics to estimate which workload-to-core mapping is likely to provide the best performance.7 Given the increasing number of cores per chip and coscheduled tasks, techniques similar to the ones used in Paragon may be applicable when deciding how to schedule applications on such chips.

    References

    1. J. Mars, L. Tang, and R. Hundt, "Heterogeneity in Homogeneous Warehouse-Scale Computers: A Performance Opportunity," IEEE Computer Architecture Letters, vol. 10, no. 2, 2011, pp. 29-32.

    2. J. Mars et al., "Bubble-Up: Increasing Utilization in Modern Warehouse Scale Computers via Sensible Co-locations," Proc. 44th Ann. IEEE/ACM Int'l Symp. Microarchitecture, 2011, pp. 248-259.

    3. S. Govindan et al., "Cuanta: Quantifying Effects of Shared On-Chip Resource Interference for Consolidated Virtual Machines," Proc. 2nd ACM Symp. Cloud Computing, 2011, article no. 22.

    4. B. Hindman et al., "Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center," Proc. 8th USENIX Conf. Networked Systems Design and Implementation, 2011, article no. 22.

    5. N. Vasic et al., "DejaVu: Accelerating Resource Allocation in Virtualized Environments," Proc. 17th Int'l Conf. Architectural Support for Programming Languages and Operating Systems, 2012, pp. 423-436.

    6. D. Shelepov et al., "HASS: A Scheduler for Heterogeneous Multicore Systems," ACM SIGOPS Operating Systems Rev., vol. 43, no. 2, 2009, pp. 66-75.

    7. K. Craeynest et al., "Scheduling Heterogeneous Multi-Cores through Performance Impact Estimation (PIE)," Proc. 39th Ann. Int'l Symp. Computer Architecture (ISCA '12), 2012, pp. 213-224.


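  • singular vectors and the diagonal matrix of singular values (Σ):

    \[
    A_{m,n} =
    \begin{pmatrix}
    a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
    a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
    \vdots & \vdots & \ddots & \vdots \\
    a_{m,1} & a_{m,2} & \cdots & a_{m,n}
    \end{pmatrix}
    = U \cdot \Sigma \cdot V^{T}
    \]

    where

    \[
    U_{m \times r} =
    \begin{pmatrix}
    u_{1,1} & \cdots & u_{1,r} \\
    \vdots & \ddots & \vdots \\
    u_{m,1} & \cdots & u_{m,r}
    \end{pmatrix},
    \quad
    V_{n \times r} =
    \begin{pmatrix}
    v_{1,1} & \cdots & v_{1,r} \\
    \vdots & \ddots & \vdots \\
    v_{n,1} & \cdots & v_{n,r}
    \end{pmatrix},
    \quad
    \Sigma_{r \times r} =
    \begin{pmatrix}
    \sigma_{1} & \cdots & 0 \\
    \vdots & \ddots & \vdots \\
    0 & \cdots & \sigma_{r}
    \end{pmatrix}
    \]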
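As a worked illustration of this factorization (the numbers are hypothetical and not taken from the article), consider two users and three movies, where the second user's ratings are exactly half of the first user's. Every row is then a multiple of the same pattern, so the matrix has rank 1 and the decomposition has a single singular value. Here Σ denotes the diagonal matrix of singular values:

```latex
\[
A_{2,3} =
\begin{pmatrix} 4 & 4 & 2 \\ 2 & 2 & 1 \end{pmatrix}
=
\underbrace{\frac{1}{\sqrt{5}} \begin{pmatrix} 2 \\ 1 \end{pmatrix}}_{U_{2 \times 1}}
\cdot
\underbrace{\begin{pmatrix} 3\sqrt{5} \end{pmatrix}}_{\Sigma_{1 \times 1}}
\cdot
\underbrace{\frac{1}{3} \begin{pmatrix} 2 & 2 & 1 \end{pmatrix}}_{V^{T}_{1 \times 3}}
\]
```

Both singular vectors have unit length, and the scalar factors multiply to (1/√5)(3√5)(1/3) = 1, so the product is the outer product of (2, 1) and (2, 2, 1), which reproduces A. The single singular value 3√5 ≈ 6.71 captures the one pattern in the data: U says the first user relates to it twice as strongly as the second, and V says the first two movies relate to it twice as strongly as the third.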

    Dimension r is the rank of matrix A, and it represents the number of similarity concepts identified by SVD. For instance, one similarity concept might be that certain movies belong to the drama category, while another might be that most users who liked the movie The Lord of the Rings: The Fellowship of the Ring also liked The Lord of the Rings: The Two Towers. Similarity concepts are represented by the singular values σ_i in Σ, the diagonal matrix of singular values, and the confidence in a similarity concept by the magnitude of the corresponding singular value. Singular values in Σ are ordered by decreasing magnitude. Matrix U captures the strength of the correlation between a row of A and a similarity concept. In other words, it expresses how users relate to similarity concepts such as the one about liking drama movies. Matrix V captures the strength of the correlation of a column of A to a similarity concept. In other words, to what extent does a mo