8/14/2019 DC Power Architecture
ICSET 2008
Abstract: Power quality (PQ) disturbances, e.g. transient voltages, voltage distortion, voltage sags and swells, overvoltages and undervoltages, and voltage interruptions, cause critical electronic component failures, resets, shortened lifetimes, and cascading failures that can bring down the operation of a whole data center. Data center downtime may cost a million dollars per hour. International standards, including TIA-942, IEEE-493, IEEE-446, IEEE-1100, and IEC 62040-3, recommend fault-tolerant designs to protect against single points of failure (SPoF) throughout data center power distribution systems (DC-PDS). A new generalized approach is given to illustrate a better model for protecting power quality and eliminating SPoF. This research proposes a new model of the optimum availability and investment tradeoffs for data center conceptual design, together with a spectrum investigation and risk assessment of DC-PDS.
I. INTRODUCTION
Natural disasters and human-made events are the original sources of power disturbances. The consequential cost of damage includes not only the cost of replacing equipment and the labor cost of fixing the problems but also the cost of system downtime and of lost reputation for the organization. Gartner Group reported that the cost of downtime for a brokerage operation is around US$6.45 million per hour [3]. However, the cost of lost reputation and business confidence may not be expressible as a number. Power quality disturbances come from many sources, e.g. lightning surges, surges from non-arcing electrostatic discharges (ESD), and non-linear equipment. Moreover, they take many forms, e.g. undervoltage, overvoltage, transient voltage, and voltage distortion [1]. When developing the criteria for power quality protection, it is critical to consider the high-frequency phenomena of lightning and ESD. Wiring and grounding practices for a special construction such as a data center (DC) require serious attention to damage prevention.
A DC has a unique and complex power infrastructure that is difficult and time-consuming to repair. It is important to understand the effect of power disturbances on data center equipment and on the processes needed to return the system to normal operation. A process interruption caused by a power outage or transient voltage may require a complete restart or the repair of components, which affects the time to repair (TTR) or mean time to repair (MTTR) [10], [12], [15]. The more obvious consequence is reduced data center availability for services or production. Downtime cost models for data centers have been presented by many researchers [3], [9], [11].
Manuscript received July 15, 2008. Montri Wiboonrat is a Ph.D. candidate of the Graduate School of Information Technology: Computer and Engineering Management, Assumption University, Bangkok, Thailand ([email protected]).
The data center power distribution system (DC-PDS) is modeled to optimize objective functions that trade off downtime costs against investment in devices, operation, and energy consumption. Past data center planning, before the millennium, was static: it considered only a single planning period according to the technologies and point demands of the time. New design, after TIA 942-2005, uses dynamic planning that concentrates on optimization, efficiency, and effective utilization of power and space, together with reliability/availability and investment.
Many standards support the DC-PDS design model, e.g. TIA 942-2005, IEEE 446-1995, IEEE 493-2007, IEEE 1100-1999, IEC 62040-3-1999, ASHRAE, and EN 1047-2. In practice, DC-PDS design is widely carried out by ad hoc methods that reflect the internal and external constraints of each organization. The risk acceptance of each business varies with its downtime cost model [9], [11]. For example, a banking service requires the highest data center reliability, 99.9999% availability, or close to zero downtime. A gas and oil production plant may be able to stop data center operation for a few hours per year for overall system maintenance. Increasing the level of reliability/availability means increasing the acquisition investment. This investment needs to be balanced against the cost of downtime and business reputation [11], [13].
In this paper, the researcher presents a risk anatomy that can help data center designers or operators identify the single points of failure (SPoF) of a DC-PDS and improve power reliability with an investment that is optimal for the accepted level of risk. Moreover, this research integrates and applies the international standards [5], [6], [7], [8], [16] as a basis for minimum requirements. A risk zone assessment model of DC-PDS evaluates power distribution reliability and incorporates it into the overall objective function, which weighs downtime costs against investment, operation, and efficiency.
II. DOWNTIME COST MODEL
The cost of an outage to a company is not only lost revenue but also the wasted time of employees who cannot get their work done during the outage. Loss of data center availability directly affects the bottom line of the facility infrastructure, since it takes a day to a week to recover fully after a short-lived unplanned downtime. The two major factors that determine downtime cost are the frequency and the duration of power outages.
Business losses justify the investment cost of data center Tier availability. Estimation of business losses per
Risk Anatomy of Data Center Power Distribution Systems
Montri Wiboonrat, Assumption University, Bangkok, Thailand
978-1-4244-1888-6/08/$25.00 © 2008 IEEE
Authorized licensed use limited to: David Ibarra. Downloaded on February 2, 2009 at 20:30 from IEEE Xplore. Restrictions apply.
hour should be compensated by the forward investment cost, which can be assessed with the return on investment (ROI) model shown as follows [2], [11].
$$\mathrm{ROI} = \frac{\text{Benefits (Downtime Costs)} - \text{Investment Costs}}{\text{Investment Costs}}$$
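As a rough illustration of the ROI expression, the following Python sketch evaluates it for made-up figures. The function name `roi` and the dollar amounts are assumptions for illustration only; the paper gives no worked numbers.

```python
def roi(downtime_cost_benefits: float, investment_costs: float) -> float:
    """Return (benefits - investment) / investment.

    `downtime_cost_benefits` is the downtime cost avoided by the investment.
    """
    if investment_costs <= 0:
        raise ValueError("investment_costs must be positive")
    return (downtime_cost_benefits - investment_costs) / investment_costs


# Hypothetical figures: a US$2.0M upgrade that avoids US$3.0M of downtime cost.
example_roi = roi(3_000_000.0, 2_000_000.0)
print(f"ROI = {example_roi:.2f}")
```

A positive result means the avoided downtime cost exceeds the investment; a negative result means the investment costs more than the losses it prevents.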
Reputation (R) and Goodwill (G) are the hardest factors to express as monetary values because they are subjective. They depend on the business segment and customer group affected, as shown in (1).
$\lambda$ : Frequency of interruption (occurrences per year)
t : Duration of interruption (at least an hour per occurrence; integer number)
L : Cost of business lost per hour of occurrence (estimated average cost of an hour of downtime)
R : Business losses in terms of reputation and business accountability (subjective)
G : Loss of reliable relations with partners and suppliers (goodwill; subjective)
I : Cost tradeoffs for increasing system reliability
$Av_i$ : Increase in system availability of the Tier
I can vary widely subject to component brands, component inherent characteristics (CIC), and system connectivity topology (SCT). Data center site reliability/availability depends on the details of component selection (CIC) and on the system connectivity topology (SCT), e.g. series-parallel, k-out-of-n, bridge, and active-standby mode [13], [14], [15].
The correlation between data center investment and availability is illustrated in Fig. 1. The optimal DC availability range differs from business to business, subject to the levels of (1) and the acceptable losses $f(L_{(i,t)})$ [11]. However, the simulation results show that high investment will not gain higher availability beyond the inverse availability point, as depicted in Fig. 1.
[Figure 1: unavailability cost plotted against availability levels (Tier I, Tier II, Tier III, Tier IV - IV+), marking the under-availability range, the optimal availability range and point, and the inverse availability point and range.]
Fig. 1. Optimum availability and investment tradeoffs [11]
$$\begin{aligned}
L ={}& (\text{Employee cost/hour} \times \text{Employees affected by outage}) \\
&+ (\text{Avg. revenue/hour} \times \text{Revenue affected by outage}) \\
&+ (\text{Replaced or changed equipment costs} + \text{Resume labor hours}) \\
&+ \sum_{i=1}^{n} (\text{Avg. lawsuits/hour} \times \text{No. of contracts}(i)) \\
&+ (\text{Business reputation lost to customers: subjective}) \\
&+ (\text{Loss of goodwill to partners and suppliers: subjective})
\end{aligned}$$
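The hourly loss L can be tallied term by term. The sketch below is one possible reading, not the author's implementation: all field names are hypothetical, "Rev. affected by outage" is assumed to be a fraction of hourly revenue, "resume labor hours" is assumed to be already converted to a cost, and the two subjective terms are plain inputs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HourlyLossInputs:
    employee_cost_per_hour: float
    employees_affected: int
    avg_revenue_per_hour: float
    revenue_fraction_affected: float      # assumed to be a fraction in [0, 1]
    equipment_replacement_cost: float
    resume_labor_cost: float              # assumed: labor hours already priced
    avg_lawsuit_cost_per_hour: float
    contracts: List[int] = field(default_factory=list)  # No. of contracts(i)
    reputation_loss: float = 0.0          # subjective estimate
    goodwill_loss: float = 0.0            # subjective estimate


def hourly_loss(x: HourlyLossInputs) -> float:
    """Sum the terms of L in the order they appear in the text."""
    lawsuits = sum(x.avg_lawsuit_cost_per_hour * n for n in x.contracts)
    return (
        x.employee_cost_per_hour * x.employees_affected
        + x.avg_revenue_per_hour * x.revenue_fraction_affected
        + x.equipment_replacement_cost
        + x.resume_labor_cost
        + lawsuits
        + x.reputation_loss
        + x.goodwill_loss
    )


# Hypothetical inputs for a small site.
example = HourlyLossInputs(
    employee_cost_per_hour=50.0,
    employees_affected=100,
    avg_revenue_per_hour=20_000.0,
    revenue_fraction_affected=0.5,
    equipment_replacement_cost=1_000.0,
    resume_labor_cost=500.0,
    avg_lawsuit_cost_per_hour=200.0,
    contracts=[3, 2],
)
print(hourly_loss(example))
```

Keeping the subjective terms as explicit inputs makes it easy to see how much of L rests on estimates rather than on accounting data.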
Lost revenue per hour differs from business to business, e.g. brokerage operations US$6.45M, credit card authorization US$2.6M, eBay US$225K, Amazon.com US$180K, cellular service activation US$41K, and ATM service fees US$14K [3].
The factors of concern depend on a rational awareness of the tradeoffs, as shown in (1), for the requirements of each business type. The optimal point between data center site availability and investment cost is derived from the slope in Fig. 1 together with the result from the ROI.
$$(\lambda \cdot t \cdot L) + R + G \geq I \qquad (1)$$
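Assuming relation (1) compares the expected loss (interruption frequency times duration times hourly loss L, plus R and G) with the investment I, a quick check can be sketched as follows. The direction of the comparison (an expected loss at least as large as I justifies the investment) is an assumption, as are the function names and the frequency, duration, and investment figures; only the US$41K hourly loss is taken from the list above.

```python
def expected_loss(
    frequency_per_year: float,
    duration_hours: float,
    loss_per_hour: float,
    reputation: float = 0.0,
    goodwill: float = 0.0,
) -> float:
    """Left-hand side of (1): frequency * duration * hourly loss + R + G."""
    return frequency_per_year * duration_hours * loss_per_hour + reputation + goodwill


def investment_justified(loss: float, investment: float) -> bool:
    """Assumed reading of (1): the investment is justified when loss >= investment."""
    return loss >= investment


# Hypothetical site: 2 outages per year of 3 hours each at US$41K per hour.
loss = expected_loss(frequency_per_year=2, duration_hours=3, loss_per_hour=41_000.0)
print(loss, investment_justified(loss, investment=200_000.0))
```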
Business loss depends not only on the type of business but also on time, as seen in Fig. 2. The relation between business loss and elapsed time is exponential, as shown in (2). Consider an international bank that operates across time zones, starting from Japan and moving to Australia, Hong Kong, Singapore, and Thailand. The transactions between countries overlap by time zone: transactions from Japan to Australia start first, followed by Japan to Thailand, Japan to England, and so on. Thus, the effect of data center downtime accumulates and the damage increases as a chain reaction, as depicted in Fig. 2 by the accumulation function $f(L_{(i,t)})$.
Assumption: each downtime starts at 1 and t is equal to or greater than 1 hour.
Fig. 2. Time dependency accumulation losses [11]
$$f\left(L_{(i,t)}\right) = \sum_{i=1} L_i \cdot e^{(t-1)}, \quad t \geq 1 \qquad (2)$$
$L_{(i,t)}$ : Time dependency accumulation losses.
III. RELIABILITY ASSUMPTION MODEL
A. Tier IV Data Center Model
The TIA 942 Tier IV data center is taken as the baseline fault-tolerant model for risk assessment, from the utility incoming feed through to the load points, as depicted in Fig. 3.
Fig. 3. Tier IV Data Center Diagram [16]
B. IEEE 493-2007, 2(N+1) Model
To strengthen the critical protective source, the UPS, the system requires one out of N components to operate. The design supplies critical loads in parallel from 2(N+1) separate and independently operating UPSs with a static transfer switch (STS) and manual bypass. The annual availability of the 2(N+1) design is 99.99914%, with a probability of failure of 16.49% over 5 years, as depicted in Fig. 4.
Fig. 4. IEEE 493, 2(N+1) Power Equipment [7]
C. Fault Tolerance DC-PDS Model
The objective of a fault-tolerant topology is to eliminate single points of failure (SPoF) from the DC-PDS. Power quality is cleaned by applying power conditioning technology, as depicted in Fig. 5. Zero downtime is the main purpose of data center operation. System maintenance without interrupting operation not only extends equipment life but also prevents equipment failure before the mean time to failure (MTTF).
IV. POWER QUALITY ZONE ASSESSMENT
The researcher proposes a fault tolerance analysis approach model.
A. High Voltage: Zone 0
For TIA 942 Tier IV, with 99.995% uptime, the utility grids supporting this model are defined as independent of each other. With a second utility grid, 95% of power quality (PQ) problems can be avoided. The reliability of PQ differs from location to location and from country to country, especially between developing and developed countries. According to [9], Table I, the reliability of PQ is only 99.74924%, which means a downtime of 21.96657 hours per year. The gap between the 99.995% requirement and the real-life PQ figure of 99.74924% is called risk acceptance. Power outages caused by natural disasters are uncontrollable and unpredictable.
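The downtime figure follows directly from the availability percentage. The short Python sketch below reproduces it, assuming a year of 8760 hours (365 days, leap years ignored); the function name is illustrative.

```python
HOURS_PER_YEAR = 8760  # 365 days; leap years ignored (assumption)


def annual_downtime_hours(availability_percent: float) -> float:
    """Convert an availability percentage into expected downtime hours per year."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability_percent must be between 0 and 100")
    return (1.0 - availability_percent / 100.0) * HOURS_PER_YEAR


measured = annual_downtime_hours(99.74924)   # PQ reliability quoted from Table I
required = annual_downtime_hours(99.995)     # Tier IV uptime requirement
print(f"measured: {measured:.2f} h/year, required: {required:.2f} h/year")
print(f"gap (risk acceptance): {measured - required:.2f} h/year")
```

Under this assumption the measured figure comes to about 21.97 hours per year against roughly 0.44 hours per year for the Tier IV requirement.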
TABLE I
POWER QUALITY DISRUPTIONS [9]
B. Low Voltage: Zone I, Main Distribution Board
Transformers, diesel engines, and automatic transfer switches (ATSs) are defined as the critical components in this zone, because the equipment with the lowest reliability determines the reliability of the system. The diesel engine has the weakest MTBF in this model [7], since it has the highest failure rate. To eliminate this reliability risk, the design requires a parallel system, 2(N+1), to ensure the continuity of the power system. The rest of the equipment is designed as 2N parallel, A Side and B Side, as shown in Fig. 5-Zone I.
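The effect of the weakest component and of the parallel A Side and B Side can be sketched with the textbook series and parallel availability formulas. This is a simplification, not the IEEE 493 calculation method: it assumes independent failures and ignores common-cause events, and the component availabilities below are hypothetical.

```python
from math import prod
from typing import Iterable


def series_availability(avails: Iterable[float]) -> float:
    """All components must work: multiply the individual availabilities."""
    return prod(avails)


def parallel_availability(avails: Iterable[float]) -> float:
    """At least one path must work: 1 minus the product of unavailabilities."""
    return 1.0 - prod(1.0 - a for a in avails)


# Hypothetical chain per side: transformer, diesel engine, ATS.
side = series_availability([0.9995, 0.999, 0.9998])
system = parallel_availability([side, side])  # A Side and B Side
print(f"one side: {side:.6f}, both sides in parallel: {system:.8f}")
```

The series product can never exceed its weakest term, which is why the diesel engine dominates a single side, while the parallel combination shrinks the combined unavailability to the product of the two sides' unavailabilities.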
C. Low Voltage: Zone II, Uninterruptible Power Unit
Zone II can be defined as mission critical for data center operation because, in the first stage of a power outage, the UPSs continue to supply power to the loads immediately [10], [12]. A 2(N+1) UPS arrangement is proposed to reduce reliability risk. Ride-through is needed for power outages of up to about 500 ms; the flywheel on A Side can handle outages of 15-20 seconds. If the outage lasts longer than the flywheel can handle, the UPS and batteries on B Side keep supplying the loads.
Fig. 5. A New Fault Tolerance DC-PDS Model
At the same time, during these 15-20 seconds the diesel engines are already on standby to pick up the emergency load and support the flywheel and the UPS with batteries. A bypass isolation transformer with STS is designed to assure reliability during UPS maintenance. It also provides clean power quality downstream, as shown in Fig. 5-Zone II for each of A Side and B Side. A new design improvement to reduce MTTR is that all types of circuit breakers are draw-out models. PQ risk can be generated in this zone by non-linear equipment [4], [6]. However, it is prevented through the features of the UPS and the isolation transformer in Zone III before it reaches the critical loads.
D. Low Voltage: Zone III, Power Disturbance Safe Operating Zone (PD-SOZ)
Complex failure propagation across the power system requires coordination among circuit breakers in large centralized power systems. The design solution is to collocate 2(N+1) UPSs that supply the loads through separate and independent sides, A Side and B Side, as shown in Fig. 5-Zone III. An isolating transformer is applied in this zone not only to reduce both the imbalance and the third harmonic of non-linear loads but also to reduce electrical noise in the system and increase the power factor for non-linear loads. The 2N parallel design of power distribution, A Side and B Side, evaluated against the IEEE 493 data sheet in Table 8-1, page 194, gives a system MTBF of 188,654.5 hours, an MTTR of 1.64 hours, an availability of 99.99913%, and a probability of failure over 5 years of 16.16% [7].
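The quoted availability is consistent with the standard steady-state relation A = MTBF / (MTBF + MTTR), as the following sketch shows. The function name is illustrative, and the sketch does not attempt to reproduce the 5-year failure probability, which depends on the IEEE 493 calculation method.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures and mean time to repair."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


availability = steady_state_availability(188_654.5, 1.64)
print(f"availability = {availability * 100:.5f}%")
```

Because MTTR appears in the denominator, shortening repair time, for example with draw-out circuit breakers, raises availability even when MTBF is unchanged.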
V. DISCUSSION
Fig. 6 shows the distribution of sags and outages per site per year. The weight given to each investment in protecting critical equipment should be analyzed from the PQ history of the data center site. If a record of several years shows that more than 50% of interruptions last less than 10 seconds, investment in UPSs, either flywheel UPSs or UPSs with batteries, can be effective. If the record shows that 20-50% of interruptions last more than 10 minutes, investment in diesel generators, N+1, can be effective. In other cases, investment can be balanced equally between UPSs and diesel generators. The researcher recommends a voltage and frequency independent (VFI) UPS type with a triple Classification 1 rating to improve system efficiency [4], [5].
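The site-history rule of thumb above can be written as a small classifier. This is one possible reading of the thresholds, not the author's procedure: the function name, the handling of the 20-50% band as "at least 20%", and the returned labels are all assumptions.

```python
from typing import Sequence

SHORT_LIMIT_S = 10.0         # "less than 10 seconds" in the text
LONG_LIMIT_S = 10.0 * 60.0   # "more than 10 minutes" in the text


def recommend_investment(durations_s: Sequence[float]) -> str:
    """Suggest where to weight investment from recorded interruption durations (seconds)."""
    if not durations_s:
        raise ValueError("need at least one recorded interruption")
    total = len(durations_s)
    short_share = sum(d < SHORT_LIMIT_S for d in durations_s) / total
    long_share = sum(d > LONG_LIMIT_S for d in durations_s) / total
    if short_share > 0.5:    # assumed reading of the 50% threshold
        return "UPS"
    if long_share >= 0.2:    # assumed reading of the 20-50% band
        return "diesel generators (N+1)"
    return "balanced"


# Hypothetical multi-year record of interruption durations in seconds.
print(recommend_investment([1, 2, 3, 700]))
print(recommend_investment([5, 700, 900, 60, 120]))
```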
To obtain continuous business availability, the system requires preventive processes for critical load points. The necessary processes to take into consideration are as follows:
1) Operators require comprehensive training on the existing system design, the power distribution system layout, and common problems and solutions. This training prevents human-made errors of commission and omission during daily operations and regular maintenance.
2) At the beginning of the design process, consultants or designers need to consider the international standards and the latest equipment technologies, and confirm that those technologies are mature in their operation and maintenance procedures. High reliability (MTTF) and correct sizing of the selected equipment prevent short operating life and overload current trips, and together support energy effectiveness, optimal investment, and lower maintenance costs as a perfect synergy.
3) Contingency plans must be instituted to cope with natural disasters, which are unpredictable and uncontrollable.
Fig. 6. Distribution of Sags and Outages per Site per Year [9]
The relation between the downtime cost model and the reliability model is called the optimum availability and investment tradeoff: designers and investors need to discuss what level of availability is sufficient and achievable with a constrained investment. The consideration shall satisfy (1). Investment and data center availability are not the only concerns; data center downtime can destroy a business as well [12].
VI. CONCLUSION
The next generation of data center power distribution system planning is required to satisfy growing and changing system load demand during the planning period, and to support critical operation under the concepts of safety, reliability, consistency, dependability, optimization, utilization, efficiency, and regulation. Risk analysis of the data center power distribution system is needed to understand the nature of equipment function and stage failures for preventive and corrective action. Planned system downtime is much better than unplanned system downtime.
REFERENCES
[1] A. Bendre, D. Divan, W. Kranz, and W. Brumsickle, "Equipment Failures Caused by Power Quality Disturbances," 39th IAS Annual Meeting, Industry Applications Conference, vol. 1, 3-7 Oct. 2004, pp. 482-489.
[2] B. Boehm, L. Huang, A. Jain, and R. Madachy, "The ROI of Software Dependability: The iDAVE Model," IEEE Software, May/June 2004, pp. 54-61.
[3] K. Davidson, K. Darrow, T. Bryson, and B. Major, "Advanced Microturbine System (AMTS) Market Study," prepared for DOE and Capstone Turbine Corporation, prepared by Onsite Energy Corporation, April 2001.
[4] W. Solter, "A New International UPS Classification by IEC 62040-3," 24th Annual International Telecommunications Energy Conference, INTELEC 2002, 2002, pp. 541-545.
[5] IEC 62040-3 Ed. 1.0 B: 1999, "Uninterruptible power systems (UPS), Part 3: Method of specifying the performance and test requirements," 1999.
[6] IEEE Std 446-1995 (Revision of IEEE Std 446-1987), "IEEE Recommended Practice for Emergency and Standby Power Systems for Industrial and Commercial Applications," 12 December 1995.
[7] IEEE Std 493-2007 (Revision of IEEE 493-1997), "Recommended Practice for the Design of Reliable Industrial and Commercial Power Systems," Gold Book, 7 February 2007.
[8] IEEE Std 1100-1999 (Revision of IEEE Std 1100-1992), "IEEE Recommended Practice for Powering and Grounding Electronic Equipment," 22 March 1999.
[9] K. Darrow and B. Hedman, "The Role of Distributed Generation in Power Quality and Reliability," New York State Energy Research and Development Authority, December 2005.
[10] M. Wiboonrat, "An Empirical Study on Data Center System Failure Diagnosis," 3rd International Conference on Internet Monitoring and Protection, IEEE ICIMP 2008, Romania, June 29-July 5, 2008, accepted for publication.
[11] M. Wiboonrat, "An Optimal Data Center Availability and Investment Trade-Offs," 9th International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, IEEE SNPD 2008, Thailand, August 6-8, 2008, accepted for publication.
[12] M. Wiboonrat, "Dependability Analysis of Data Center Tier III," 13th International Telecommunications Network Strategy and Planning Symposium, NETWORKS 2008, Budapest, Hungary, Sept. 28-Oct. 2, 2008, accepted for publication.
[13] M. Wiboonrat, "Power Reliability and Cost Trade-Offs: A Comparative Evaluation between Tier III and Tier IV Data Centers," Power Conversion and Power Management, Digital Power Forum 2007, San Francisco, CA, September 10-12, 2007.
[14] M. Wiboonrat, "Beyond Data Center Tier IV Reliability Enhancement," Power Conversion and Power Management, Digital Power Europe 2007, Munich, Germany, November 13-15, 2007.
[15] M. Wiboonrat and C. Jungthirapanich, "Reliability Enhancement via the Failure Modes, Effects, and Criticality Analysis (FMECA) and the Reliability Block Diagram (RBD)," 8th International Conference on Opers. & Quant. Management, ICOQM 2007, Bangkok, Thailand, October 17-20, 2007.
[16] W. P. Turner IV, J. H. Seader, V. Renaud, and K. G. Brill, "Tier Classifications Define Site Infrastructure Performance," White Paper, The Uptime Institute, Inc., 2008.