sustainable information technology ecosystem information technology ecosystem supply and demand side...

33
© 2006 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Sustainable Information Technology Ecosystem supply and demand side management Chandrakant D. Patel HP Fellow [email protected] HP Laboratories, Hewlett-Packard Company

Upload: doannga

Post on 07-Jul-2018

213 views

Category:

Documents


0 download

TRANSCRIPT

© 2006 Hewlett-Packard Development Company, L.P.The information contained herein is subject to change without notice

Sustainable Information Technology Ecosystemsupply and demand side management

Chandrakant D. PatelHP [email protected]

HP Laboratories, Hewlett-Packard Company

2 7 March 2008 Chandrakant Patel, [email protected]

Drivers of Next Generation of IT ServicesUp and coming Small and Medium Enterprises

IT Advantage in the Emerging Economies• Enabling Business Transformation

Retail Example: Subhiksha Trading Services

“IT will be the key enabler and the key differentiator between operations that are or aren’t well run. Whether it is managing the front end or logistics support or in stock and inventory management”.- R. Subramaniam, MD and Founder of Subhiksha Trading Services

- The Economic Times, March 13, 2007

Subhiksha Store

Vadodara, Gujarat, India

3 7 March 2008 Chandrakant Patel, [email protected]

Drivers of Next Generation of IT ServicesMicro businesses in emerging economies

IT services as a means for micro-businesses to improve quality of life if provided at a given price point

4 7 March 2008 Chandrakant Patel, [email protected]

Sustainable IT Ecosystem Deconstruct conventional supply chain and replace with Sustainable IT ecosystem

5 7 March 2008 Chandrakant Patel, [email protected]

The Road AheadNext Generation IT Services• IT and telecommunications is a means to overcome physical infrastructure

issues such as transportation, increase productivity and improve the standard of living.

• Necessitates a total cost of ownership (TCO) that enables growth in new areas e.g. 800 million in India who want to use IT as a means to improve quality of life and business competitiveness.− Achieving the TCO necessitates management of available energy in the

IT ecosystem as a key resource.

Devising a Sustainable IT Ecosystem− Evolve an “end to end” metric and process to address sustainability in IT− Design, manufacture, operate and manage end of life of products that

minimize consumption of available energy – a “cradle to cradle”perspective

6 7 March 2008 Chandrakant Patel, [email protected]

Devising a Sustainable IT Ecosystemsupply side + demand side management

Ground State

Reclaim

irreversibility

Framework needed to quantify the consumption of available energy (exergy)

7 7 March 2008 Chandrakant Patel, [email protected]

Exergy or Available Energy

• “Exergy” is a measure of the quality of energy−Or, alternatively, the amount of work available from a

given amount of energy (‘available energy’)

• From the 2nd Law of Thermodynamics: − Irreversibilities in real processes continuously degrade

the quality of energy, simultaneously destroying exergy−For example, the conversion of coal to electricity

• Or electricity to heat• Or rejection of heat to ambient

8 7 March 2008 Chandrakant Patel, [email protected]

Approach

• Can a measure of the total exergy destroyed across a product’s lifetime (“lifetime exergy”) be a measure of the environmental sustainability?

manufacturing

exergy toproducematerials

exergy tofabricate and assemble components

exergy fortransportation

operationdelivery

exergy forinstallation

exergy to power electronics and thermal managment

exergy formaintainanceand repair

exergy formaintainanceand repair

retirement

exergy tode-install anddecommision

exergy todisassembleand/orrecondition for safe disposal or recycle

time

9 7 March 2008 Chandrakant Patel, [email protected]

OperationPowering the electronics and thermal Management

Exergy consumed:• Electronics and electro-mechanical systems• Work required to remove heat

− Flow Work • volume flow, m3/s• pressure drop, Pa

− Thermodynamic Work• Temperature, °C

Heat Sink

Epoxy Glass Printed Circuit Board

Tin, Tout

Operation

10 7 March 2008 Chandrakant Patel, [email protected]

Heat Flux, W/cm2

Technology TrendsTechnology Trendschip packaging, integrated photonics, semiconductor technologychip packaging, integrated photonics, semiconductor technology

1998 2006

10

100

1000Frequency

Device Size

# of Devices

Voltage

Capacitance per device

per chip

2010

PGAEnhanced Ball Grid Array

Flip Chip Array, Underfill Flip Chip MCM

Flip Chip, Non Uniform Heat Distribution, Power Mgmt

Pow

er, W

200

3d 3d pkgpkg

400

# of Cores

• High Power Density

• Temperature Control due to optical devices

2002

Source: Chandrakant Patel projections, non industry data

11 7 March 2008 Chandrakant Patel, [email protected]

Thermal Management Challengestacked devices: chip and package scale

Heat SinkPassive e.g. body of the handheld

Q, W

High power density microprocessor with stacked devices

12 7 March 2008 Chandrakant Patel, [email protected]

Chip, Package and System ScaleWork required to remove the heat

Air (25°-40 °C)

Winterface

Demise of Passive Only Cooling Solution• Work Required at the chip interface level due to high power density and advent of new packaging

Interface

13 7 March 2008 Chandrakant Patel, [email protected]

Data Center ScaleWork required to remove heat Drive

T inTout

14 7 March 2008 Chandrakant Patel, [email protected]

Modular AC Unit

Cool Fluid Hot Fluid

Data Center CoolingConventional approach in controlling temperature in the data center

Data Center CoolingConventional approach in controlling temperature in the data center

Rack

Cooling Coil

Air Mover

RaisedFloor

Treturn at 20 °C

• Single point temperature measurement at the return of the CRAC (Computer Room Air Conditioning Unit), typically set at 20 oC

Tin ~ 14 °C

15 7 March 2008 Chandrakant Patel, [email protected]

IT EcosystemBillions of Users and 1000s of Data Centers

Wchip

Wsys

Wpump

Wb

Wpump Wcompressor

Wb

Qchip

Qsystem

( ) system Cooling of WorkmicThermodyna Work FlownDissipatioHeat Total COP

+=ensemble

Patel, C.D., Sharma, R.K., Bash, C.E., Beitelmal, M, “Energy Flow in the Information Technology Stack: Introducing the Coefficient of Performance of the Ensemble”, ASME International Mechanical Engineering Congress & Exposition, November 5-10, 2006, Chicago, Illinois

Coefficient of Performance of the Ensemble

Wensemble

Qdata center

16 7 March 2008 Chandrakant Patel, [email protected]

Coefficient of Performance of the Ensemble

( )∑ ∑∑∑ ∑ ++++

+

+

=

−−k

ctcompm

pl

crbj

ri

devcp

dcG

WWWWWWW

QCOPsup

Cooling Tower loop

Chiller Refrigerant loop

Chilled Water loop

Data Center CRAC units

Warm Water

Air Mixture In

QCond

Wcomp

Air Mixture In

Return Water

QEvap

Cooling tower wall are adiabatic, Q = 0

Makeup Water

Air Mixture Out

Wp

Wp

( )∑=ondarypdchydronics WQCOP

sec

compWchQ

chCOP =

−= 1

/)1(

2

3)1(

22nn

P

P

nmotor

nPrefmcompW

pηη

ν&

ct

compdcctctct W

WQWQCOP

+==

Patel, C.D., Sharma, R.K., Bash, C.E., Beitelmal, M, “Energy Flow in the Information Technology Stack: Introducing theCoefficient of Performance of the Ensemble”, ASME International Mechanical Engineering Congress & Exposition, November 5-10, 2006, Chicago, Illinois

17 7 March 2008 Chandrakant Patel, [email protected]

ImpactFlow Work + Thermodynamic Work

Heat Generated

Energy to Remove Heat

2 MW

2 MW

10 to 100 W

Services

20 GWTotal

Global Power Consumption

Annualized Destruction of Available Resources – Coal

2-40 W

48 GW Assume:

5000 data centers

Assume: 4 billion handhelds at 12 W

200 Million Metric Tons of coal

85 Million Metric Tons

175 Million Metric Tons CO2 emission

Operation

18 7 March 2008 Chandrakant Patel, [email protected]

Data Center Total Cost of Ownership

( ) ( ) ( )1$,12112

2 1,$ σ+++++++

= depavgtotal

hardwareconsumedgridcriticaltotal ITSMRPULKLKftA

ftCost

Personnel, equipment, SW per rackBurdened power consumption

Real Estate

• Responsiveness• Energy Consumption

Factors that impact the “Burdened Power Cost”

J1 : capacity utilization factor, i.e. ratio of maximum design (rated) power consumption to the actual data center power consumption

K1 = F(J1): burdened power delivery factor, i.e. ratio of amortization and maintenance costs of the power delivery systems to the cost of grid power

K2 = F(J1): burdened cooling cost factor, i.e. ratio of amortization and maintenance costs of the cooling equipment to the cost of grid power

L1: cooling load factor, i.e. ratio of power consumed by cooling equipment to the power consumed by compute, storage and networking hardware (inverse of COPensemble) D

epre

ciat

ion

fact

ors

↓ J1

↓ L1

Lack of vital information from single-input single-output environmental control systems

Patel and Shah, Cost Model for Planning, Development and Operation of a Data Centerhttp://www.hpl.hp.com/techreports/2005/HPL-2005-107R1.html

19 7 March 2008 Chandrakant Patel, [email protected]

Compute Power Cooling

Flexible & Configurable Elements

Sensing Infrastructure

Policy based Control Engines, Tools

commodity flexible building blocks that enable dynamic change in configuration

compute, power, cooling resources provisioned based on the need

Smart IT IT-Facility End to End Management

T in at 25 °C, 27 °C ?

What is the operational limit?

20 7 March 2008 Chandrakant Patel, [email protected]

Dynamic Smart Cooling Architecture

AC Unit AC Unit

Equipment Racks

T

Room Chilled Water Supply

Ventilation Tiles

Modular Building Control

Floor Level Network (optional)

T

H H

VFDVFD

• CRAC temperature control sensors moved from return to supply side;

• Variable Frequency Drives added to AC units;

• Temperature sensors added to rack inlets;

• Advanced algorithms control operation of AC units.

T in

21 7 March 2008 Chandrakant Patel, [email protected]

HP Labs Smart Data CenterPalo Alto, California

CRAC 292 KW

CRAC 628 KW

CRAC 537 KW

CRAC 439 KW

CRAC 380kW

CRAC 194KW

Res

earc

h

RIT RIT

RIT

RITRIT

Res

earc

h

CRAC 1 CRAC 3

CRA

C 4

CRAC 2

CRA

C 6

CRA

C 5

PDU

PDUPDU

HD Area:

140 KW from 14 fully loaded racks in 400 square feet

HD Area

High Performance Cluster – low load in this state

22 7 March 2008 Chandrakant Patel, [email protected]

Conventional Mode

Dynamic Smart Cooling Mode

HP Labs Data Center

Palo Alto, CA

• 35% Energy Savings

• Improved reliability

23 7 March 2008 Chandrakant Patel, [email protected]

Local Disturbances – Vent Tile Blockage

50

55

60

65

70

75

80

85

90

11:0

0:17

11:1

7:17

11:3

4:17

11:5

1:17

12:0

8:17

12:2

5:41

12:4

2:41

12:5

9:41

13:1

6:41

13:3

3:41

13:5

0:41

14:0

7:41

14:2

4:41

14:4

1:41

14:5

8:41

15:1

5:41

15:3

2:41

15:4

9:41

16:0

6:44

16:2

4:00

16:4

1:00

16:5

8:00

17:1

5:22

17:3

2:22

17:4

9:22

18:0

6:25

18:2

3:26

18:4

0:27

18:5

7:28

19:1

4:32

19:3

1:33

19:4

8:34

Time

CR

AC

Sup

ply

Air

Tem

p (F

)

0

10

20

30

40

50

60

70

80

90

Rac

k B

4 In

let T

empe

ratu

re CRAC 1CRAC 2CRAC 3CRAC 4CRAC 5CRAC 6B4T1B4T2

September 7, 2006

DSC max allowable inlet temp

CRACs take evasive action

Tiles Blocked Blockage removed

Scenario: IT department performed maintenance on a rack and covered nearby vent tiles for 4 hours with bag. DSC responds and keeps impacted racks below maximum.

B4

24 7 March 2008 Chandrakant Patel, [email protected]

Global Disturbance – Chiller Failure

Scenario: Groundskeeper blows leaves into primary cooling tower during routine cleanup. Leaves block filter in cooling tower retarding condenser water flow that ultimately results in chiller failure. DSC delayed the impact of the failure allowing for repair.

0

10

20

30

40

50

60

70

80

10:0

0:05

10:2

2:35

10:4

5:05

11:0

7:35

11:3

0:05

11:5

2:35

12:1

5:05

12:3

7:35

13:0

0:05

13:2

2:35

13:4

5:05

14:0

7:35

14:3

0:05

14:5

2:35

15:1

5:05

15:3

7:35

16:0

0:05

16:2

2:35

16:4

5:05

17:0

7:35

17:3

0:05

17:5

2:35

18:1

5:05

18:3

7:35

19:0

0:05

19:2

2:40

19:4

5:10

20:0

8:08

20:3

0:38

20:5

3:08

21:1

5:38

21:3

8:08

Time

CR

AC

Sup

ply

Air

Tem

p (F

)

15

20

25

30

35

40

45

50

Rac

k In

let T

empe

ratu

re (C

) CRAC 1CRAC 2CRAC 3CRAC 4CRAC 5CRAC 6Rack E7Rack B4Rack A7

Primary Chiller Failure Chiller Back On

Maximum Allowable Inlet Temperature

CRAC 6 (DX)

Early reaction of DSC keeps temperatures under control until chilled water temperature reaches 74 F. Allows more time for resolution of problem.

25 7 March 2008 Chandrakant Patel, [email protected]

HP R&D Lab-Data Center Dynamic Smart Cooling Implementation in Bangalore

•Scope/Sizing

•Layout/Topology

•Comm. Fabric/Integration

•Control and Monitoring

•Management

• ServersNon-Stop serversProliant serversBlade serversCustom Enclosures

• Storage (XP/EVA)• Multiple Network topologies

• Sensor Network•7500 sensors

5 floors @14k sq. ft.900kW cooling per floor

•Chillers3 air-cooled2 water-cooled

•Pumps7 Primary5 Secondary

•CRAC units55 units

•Diesel Generators

Facility Building Blocks IT Building Blocks

26 7 March 2008 Chandrakant Patel, [email protected]

Supply Side and Demand Side Management

Demand Side (Consumption of Resources):• Library of flexible and scalable building blocks overlaid with

pervasive sensing and control to provision resources based on the need

Supply Side (Availability of Resources)• Devise a 2nd law based tool that enables a “cradle to cradle”

approach in quantifying and designing the IT ecosystem • Can it be used to:

− Dematerialize the ecosystem i.e. least materials design− Seek out appropriate materials− Seek out appropriate distributed energy sources− Seek out analytical techniques to address operations and “end of

life”• leverage the rich instrumentation used for demand side

management to detect and understand anomalies

manufacturing operationdelivery retirement time

27 7 March 2008 Chandrakant Patel, [email protected]

Dematerializing the EcosystemData Center Management: modeling, measurement and inference

Reduce redundancy, manage “end of life”:• Minimize redundancy in the data center facility infrastructure

− Example: empirical data and inference techniques to eliminate excessive redundancy e.g. standby air conditioners

• Hardware “Damage Boundary*” [1] and minimizing hardware redundancy− IT-Facility measurement and inference techniques in place, can we push the

limits of operation• Example: Understanding the impact inlet temperature Tin to life of components by

having thermo-mechanical models in place • Understanding thermo-mechanical behavior to determine root cause of failure and

manage “end of life”• Drive reliability studies have been presented before at USENIX using large

samples, SMASH data – opportunity to extend the work [2][3]

*Used for fragility assessment – can this be extended to other areas to determine operational limts[1] Lee Hedtke, Chandrakant Patel, Damage Boundary Testing of Hard Disk Drives, Proceedings of the ASME

Winter Annual Meeting, 1990[2] Pinheiro, Wolf-Dietrich Weber, Luiz André Barroso. Failure Trends in Large Disk Drive Population. In

Proceedings of the 5th USENIX Conference on File and Storage Technologies (FAST ’08), February 2007.[3] Weihang Jiang, Chongfeng Hu, Yuanyuan Zhou and Arkady Kanevsky, Are Disks the Dominant

Contributor for Storage Failures, 6th USENIX-FAST, February 2008

retirementoperationmanufacturing

28 7 March 2008 Chandrakant Patel, [email protected]

Rigid Drive Mechanism (Thermal)

Tin, °C

Tdrive,in

Tdrive core

Tdrive core,°C

Qdrive spindle

retirementoperationmanufacturing

29 7 March 2008 Chandrakant Patel, [email protected]

Thermo-Volume ResistanceExpedient Thermo-Fluids Modeling of Systems

“Thermo-Volume” [Ref] Resistance

V', m3/s

Thermal Resistance

System Blower (s)

Extract the temperature at a given location

e.g. chip, hard drive temperature

Tchip

vbypass

v

VolumeResistance

2v21v

21

ductduct

tduct

duct

ldf

df

xP ρρ +=∂∂

x•Patel and Belady, ECTC1997

•Bash and Patel, ISPS1999

•Patel and Bash, Thermal Management Course Notes, 2007

Epoxy Glass Printed Circuit Board

Can we represent sub-systems as “thermo-volumes” with thermo-mechanical attributes?

30 7 March 2008 Chandrakant Patel, [email protected]

Rigid Drive Mechanism (Mechanical)

Susceptibility to vibration:

• over the years, stiffness to mass ratio has gone up due to miniaturization

•however, in-situ disturbances in data centers can coincide with natural frequency of the arm

retirement

31 7 March 2008 Chandrakant Patel, [email protected]

Dematerializing the EcosystemLifetime Exergy Advisor - Case Study of Sample PC

Ref: HP Labs – UC Berkeley data

manufacturing

Extraction Data

32 7 March 2008 Chandrakant Patel, [email protected]

Joules: Currency of the Flat Worldecosystem: billions of handhelds and printers, thousands of data centers and print factories

Cradle to Cradle Design: Least Material, Appropriate materials based on the 2nd Law of Thermodynamics

Least Energy through need based provisioning of resources

Sustainable IT Ecosystem0

10

20

30

40

50

60

70

80

10:0

0:05

10:2

2:35

10:4

5:05

11:0

7:35

11:3

0:05

11:5

2:35

12:1

5:05

12:3

7:35

13:0

0:05

13:2

2:35

13:4

5:05

14:0

7:35

14:3

0:05

14:5

2:35

15:1

5:05

15:3

7:35

16:0

0:05

16:2

2:35

16:4

5:05

17:0

7:35

17:3

0:05

17:5

2:35

18:1

5:05

18:3

7:35

19:0

0:05

19:2

2:40

19:4

5:10

20:0

8:08

20:3

0:38

20:5

3:08

21:1

5:38

21:3

8:08

Time

CR

AC

Sup

ply

Air

Tem

p (F

)

15

20

25

30

35

40

45

50

Rac

k In

let T

empe

ratu

re (C

) CRAC 1CRAC 2CRAC 3CRAC 4CRAC 5CRAC 6Rack E7Rack B4Rack A7

Primary Chiller Failure Chiller Back On

Maximum Allowable Inlet Temperature

CRAC 6 (DX)

Early reaction of DSC keeps temperatures under control until chilled water temperature reaches 74 F. Allows more time for resolution of problem.

Opportunity for collaboration CS-ME-EE

33 7 March 2008 Chandrakant Patel, [email protected]

Thank You

• I thank the USENIX-FAST organizers for giving me this opportunity

• I thank my HPL colleagues &• I thank you for your time