eliminating data center hot spots
Post on 19-Nov-2014
DirectNET Confidential © 2007 DirectNET, Inc. All rights reserved.
Eliminating Data Center Hot Spots: An Approach for Identifying and Correcting Lost Air
December 5, 2007
Presented By:
Speakers and Sponsor
• Lars Strong, P.E., for Upsite Technologies
– Lead Engineer and Services Product Manager
– Consulted for numerous data centers internationally on infrastructure design and management structure
– Robust experience with fluid dynamics and thermodynamics of data center cooling infrastructure
• Patrick Cameron, Director of Business Development, DirectNET
– Product manager for DirectNET’s suite of data center infrastructure products
– AFCOM information session presenter and SME for remote management solutions
– Ten years of consulting experience designing and implementing custom hardware and software solutions
Agenda
• Energy Trends
• Measuring Data Center Performance
• Leading Causes of Inefficiency
• Methods for Improvement
• Recommendations in Action: Case Study Review
• Q&A
A Snapshot: Energy Trends
√ Server density has increased significantly over the past decade
√ The average server’s power consumption has quadrupled
√ Higher density and the resultant higher operating temperatures spawn increased administration costs
√ Executives are starting to look more closely at the energy budgets associated with IT infrastructure
√ Customers are running out of power and cooling capacity well before they reach the spatial limits of their facilities
Energy Trends: Specific to Cooling
• According to HP, in 85 percent of data centers, most of the non-IT power is used by the cooling resources
Source: “Data center cooling strategies”, HP, August 2007
Data Center Power Flow
Data Center Coefficient of Efficiency (CoE)
• CoE = total power / critical power
– Critical power is the consumption of the computer and communication equipment (the sum of PDU loads)
– Total power is the power required to support both the UPS and mechanical systems
• In practice: CoE = building service entrance usage / sum of PDU loads (works best for standalone data centers)
• Reference values
– Ideal CoE: 1.6
– Target CoE: 2.0
– Typical CoE: 2.4 to 2.8 and higher
– Many sites: above 3.0
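The CoE definition above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample meter readings are hypothetical, and the two-threshold comparison is an assumption layered on the ideal (1.6) and target (2.0) figures from the slide.

```python
IDEAL_COE = 1.6   # "Ideal CoE" from the slide
TARGET_COE = 2.0  # "Target CoE" from the slide


def coefficient_of_efficiency(service_entrance_kw, pdu_loads_kw):
    """Return CoE = total power / critical power (sum of PDU loads)."""
    critical_power_kw = sum(pdu_loads_kw)
    if critical_power_kw <= 0:
        raise ValueError("sum of PDU loads must be positive")
    return service_entrance_kw / critical_power_kw


def describe_coe(coe):
    """Compare a CoE value with the ideal and target figures (assumed banding)."""
    if coe <= IDEAL_COE:
        return "at or better than ideal"
    if coe <= TARGET_COE:
        return "at or better than target"
    return "above target"


# Hypothetical readings: 1,200 kW at the service entrance, three PDUs.
coe = coefficient_of_efficiency(1200.0, [150.0, 200.0, 150.0])
print(round(coe, 2), describe_coe(coe))
```

Using the building service entrance as "total power" only makes sense when the meter serves the data center alone, which is why the slide limits that shortcut to standalone sites.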
Tier Performance Standards
• Tier I: Basic Site Infrastructure
– Room dedicated to support IT equipment
• Tier II: Redundant Capacity Components Site Infrastructure
– Redundant components for increased reliability
• Tier III: Concurrently Maintainable Site Infrastructure
– Alternate distribution paths, one active
• Tier IV: Fault Tolerant Site Infrastructure
– Dual active distribution paths
• Tiers I & II: Tactical solutions
• Tiers III & IV: Strategic investments
Coefficient of Efficiency (CoE)
• Interesting revelations
– At a CoE of 2.0, it takes twice the “critical power” to operate even an efficient data center
– When CoE gets above 2.4, most of the additional power is going into inefficient mechanical systems
– As the CoE increases, the environment in the computer room can deteriorate
– Adding more cooling units increases CoE and may not reduce hotspots
What Leads to Inefficient CoE
Sources of Mechanical Inefficiencies
• Mismatched Expectations
• Mismatched Architectures
• No Master Plan
• Failure to Measure and Monitor
• Failure to Use Best Practices
• Thermal Incapacity and Excessive Bypass Airflow
Thermal Incapacity Defined
• Thermal incapacity is the portion of the mechanical system that is running, but not contributing to a dry bulb temperature change because of return air temperatures, system configuration problems, or other factors
• Most thermal incapacity can be inexpensively recovered by a mechanical system “tune-up”
Bypass Airflow: Defined
• Conditioned air is not getting to the air intakes of computer equipment
– Escaping through cable cutouts and holes under cabinets
– Escaping through misplaced perforated tiles
– Escaping through holes in computer room perimeter walls, ceiling, or floor
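One way to put a number on bypass airflow is as the share of conditioned air that never reaches the equipment intakes. The sketch below assumes that share can be estimated from two measurements: total airflow supplied by the running cooling units, and airflow delivered through correctly placed perforated tiles. The function name and the sample figures are hypothetical; this is not necessarily how the survey cited in this deck measured it.

```python
def bypass_airflow_fraction(supplied_cfm, delivered_cfm):
    """Fraction of supplied conditioned air that bypasses the equipment intakes.

    supplied_cfm:  total airflow from the running cooling units (CFM)
    delivered_cfm: airflow through correctly placed perforated tiles (CFM)
    """
    if supplied_cfm <= 0:
        raise ValueError("supplied airflow must be positive")
    if not 0 <= delivered_cfm <= supplied_cfm:
        raise ValueError("delivered airflow must be between 0 and supplied airflow")
    return (supplied_cfm - delivered_cfm) / supplied_cfm


# Hypothetical room: 100,000 CFM supplied, 57,000 CFM reaching the cold aisles.
fraction = bypass_airflow_fraction(100_000, 57_000)
print(f"{fraction:.0%} of conditioned air is bypass airflow")
```

Everything not accounted for at the perforated tiles is treated as lost through cable cutouts, misplaced tiles, or perimeter holes, so the estimate is only as good as the tile airflow measurements.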
White Paper
• A comprehensive survey of actual cooling conditions in 19 computer rooms comprising 204,400 ft² of raised floor
– Rooms ranged in size from 2,500 ft² (230 m²) to 26,000 ft² (2,400 m²)
• More than 15,000 individual pieces of data were collected.
Consequences of Thermal Incapacity
• Inefficient cooling system
– Operating cooling capacity is 2.6 times the critical load (UPS output)
– At Coefficients of Efficiency of 2.0 – 2.4
– 10% of the racks had “hotspots” with intake air exceeding 77°F (25°C)
Consequences of Thermal Incapacity
• Inefficient cooling system (cont.)
– Rooms with the greatest excess of cooling capacity had the worst environment
– At Coefficients of Efficiency > 3.0
– Up to 25% of the racks had “hotspots”
– More cooling capacity meant a poorer environment and wasted capital and operating expenses
Not Limited to High-Density Clusters
• A study by the Uptime Institute found that the highest percentage of hot spots occurred in computer rooms with very light loads.
• Between 3.2 and 14.7 times more cooling capacity was running in those rooms than was required.
• 60% of the cold air cools the room but not the critical load, except by recirculation.
How Can So Much Excess Capacity Be Installed?
• Historically, data center managers have relied on vendors and contractors
– Vendors are motivated to sell more equipment
– Contractors are motivated to perform installations
• Ignorance of the science behind cooling and capacity management
The Culprit: Airflow Management
Three categories of air movement challenges
• Below floor obstruction
– Cables, pipes, etc.
• Raised floor performance
– Cable openings, perforated tile placement, etc.
• Above floor circulation
– Cabinet layout, cooling unit orientation, ceiling height, etc.
Deciphering Hot Spots: Zone vs. Vertical
• Two varieties of hotspots
– Zone hotspots typically exist over large areas of raised floor
– Vertical hotspots are more discrete and may exist just at the top few U of an isolated cabinet
In either case, temperatures exceeding 77°F with a relative humidity of less than 40% are serious threats to maximum information availability and hardware reliability.
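A rough sketch of how intake readings from one cabinet could be screened against the 77°F figure. The classification rule is an assumption made for illustration: hot readings confined to the topmost sensor(s) are treated as a vertical hotspot, and hot readings lower down are reported as "full-height". A run of neighbouring full-height cabinets would then be a candidate zone hotspot. The function name, the labels, and the `top_sensor_count` parameter are hypothetical, and humidity is not modelled here.

```python
INTAKE_LIMIT_F = 77.0  # intake temperature limit quoted in the text


def classify_cabinet(intake_temps_f, top_sensor_count=1):
    """Classify one cabinet from intake temperatures ordered bottom to top.

    Returns "ok", "vertical" (only the top sensors are hot), or
    "full-height" (at least one lower sensor is hot as well).
    """
    hot = [temp > INTAKE_LIMIT_F for temp in intake_temps_f]
    if not any(hot):
        return "ok"
    # Sensors below the top `top_sensor_count` positions.
    lower = hot[:max(len(hot) - top_sensor_count, 0)]
    return "full-height" if any(lower) else "vertical"


# Hypothetical readings in °F, bottom of the cabinet first.
row = {
    "A1": [68, 70, 72, 80],
    "A2": [79, 80, 81, 83],
    "A3": [68, 69, 70, 71],
}
for name, temps in row.items():
    print(name, classify_cabinet(temps))
```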
Raised-Floor Utilization: Legacy Layout
• All aisles have elevated “mixed” temperature (starved supply airflow compounds problem)
• Fails to deliver predictable air intake temperatures
• Reduces return air temperature, which reduces cooling unit capacity and removes moisture
• Removed moisture must be reinserted into the computer room
Computer Room Layout Options: The Effect of Bypass Airflow
• Cold air escapes through cable cutouts
• Escaping cold air reduces static pressure, resulting in insufficient cold aisle airflow
• Result is vertical and zone hotspots in high heat load areas
Cold/Hot Aisle–Ideal Implementation: No Bypass Airflow
• Average power per rack (assuming one perforated tile per rack and 15°F temperature drop across the cooling unit coil)
– 3.3 kW per perforated tile (700 CFM)
– 6.6 kW per grate (1,400 CFM)
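The per-tile figures above are consistent with the standard sensible-heat relation for air, Q (BTU/hr) ≈ 1.08 × CFM × ΔT (°F). The sketch below assumes that relation and standard air near sea level; the slides do not state which formula was used, and the function name is hypothetical.

```python
SENSIBLE_HEAT_FACTOR = 1.08    # BTU/hr per CFM per °F, standard air (assumption)
BTU_PER_HOUR_PER_KW = 3412.14  # 1 kW = 3,412.14 BTU/hr


def cooling_kw(airflow_cfm, delta_t_f):
    """Sensible cooling carried by an airflow, in kW.

    airflow_cfm: airflow through the tile or grate (CFM)
    delta_t_f:   air temperature change in °F (15°F in the slide's assumption)
    """
    btu_per_hour = SENSIBLE_HEAT_FACTOR * airflow_cfm * delta_t_f
    return btu_per_hour / BTU_PER_HOUR_PER_KW


print(round(cooling_kw(700, 15), 1))   # perforated tile -> 3.3
print(round(cooling_kw(1400, 15), 1))  # grate -> 6.6
```

Because the relation is linear, doubling the airflow per rack doubles the supportable load, but only if that air actually reaches the intakes rather than escaping as bypass airflow.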
Sealing Options Need to be Evaluated for:
• Sealing effectiveness
• Self sealing (is labor required?)
• Ease of recabling (is labor required?)
• Dresses raw edges (NFPA 75 requirement)
• Static dissipative
• Install it and forget it (is policing required?)
• Does not contribute to contamination
A Case Study #1: Success
• Business: Major carmaker with a 10,000 ft² data center.
• Computing needs: Support of all North American operations, sales and corporate functions.
Case Study #1: Success
Problem Statement:
• IT equipment reliability problems due to high intake temperatures
• Failure rates were so high that IT equipment manufacturers were threatening to void warranties and charge for all service calls, a potentially very costly situation
• No redundant cooling capacity
Thermal Incapacity and Bypass Airflow Issues:
• Unsealed cable openings wasting 43% of conditioned air volume
A Case Study #1: Success
Solution Approach: Comprehensive remediation
• Comprehensive evaluation of the computer room’s cooling health
• Adjustment of cooling infrastructure
– Sealing bypass openings
– Perforated tile location and number
– Cooling unit set points and calibration
• No downtime or exposure to downtime from construction activities, adjustment of computer room layout, or the purchase of additional cooling units or perforated tiles
Results:
• All IT equipment air-intake temperatures brought within recommended range
• Maximum 16°F drop occurred at critical enterprise servers
• Bypass airflow reduced from 43% to less than 10%
Case Study #1: Success
Business Benefit:
• Increase in the cooling capacity of the existing CRAC units
• Cooling capacity to support growth
• There was also the side benefit of the noise level dropping significantly
• “Decreasing the operating temperatures in hotspot areas improves our equipment reliability, decreases outages, and helps us meet our business continuity goals” (quote from customer)
A Case Study #2: Failure
• After rearrangement of 30 perforated tiles, 250 servers automatically thermaled off
• Internal safety controls in the hardware shut it off to prevent overheating
• Result: Internet service for critical application service provider halted during prime time
How to get started…
KoldWorks Cooling Services
– KoldProfile: Cooling Assessment
– KoldSeminar: Education & Profile
– KoldCheck: Cooling Audit
– KoldTune: Cooling Remediation
KoldLok Raised-Floor Grommets
– KoldLok Integral
– KoldLok Surface
Cooling Tools
• Temperature Strip
• TroubleShooter
– Test to see if there’s poor or good airflow of conditioned air over holes in your perforated raised floor tiles
Receive a Complimentary Strip or TroubleShooter:
Rebecca.mccue@directnet.us
Q&A
To Arrange a Complimentary 15-Minute Cooling Evaluation
Rebecca.mccue@directnet.us
To Receive a copy of the Presentation
Rebecca.mccue@directnet.us