dc power architecture

Upload: darveniza

Post on 30-May-2018


  • 8/14/2019 DC Power Architecture

    1/6

    ICSET 2008

    Abstract: Power quality (PQ) disturbances, e.g. transient voltages, voltage distortion, voltage sags and swells, overvoltages and undervoltages, and voltage interruptions, cause critical electronic component failures, resets, shortened lifetimes, and cascading failures that can bring down an entire data center. Data center downtime may cost a million dollars per hour. International standards, including TIA-942, IEEE-493, IEEE-446, IEEE-1100, and IEC 62040-3, recommend fault-tolerant designs to protect against single points of failure (SPoF) throughout data center power distribution systems (DC-PDS). This paper gives a new generalized approach that illustrates a better model for maintaining clean power and protecting against SPoF. It proposes a new model of the optimum availability and investment tradeoffs for data center conceptual design and spectrum investigation with risk assessment of DC-PDS.

    I. INTRODUCTION

    The original sources of power disturbances are natural disasters and human actions. The resulting damage includes not only the cost of replacing equipment and the labor cost of fixing the problem but also the cost of system downtime and harm to the organization's reputation. Gartner Group puts the cost of brokerage operation downtime at around US$6.45 million per hour [3]. The cost to reputation and business confidence, however, may not be quantifiable. Power quality disturbances come from many sources, e.g. lightning surges, surges from non-arcing electrostatic discharge (ESD), and non-linear equipment. They also take many forms, e.g. undervoltage, overvoltage, transient voltage, and voltage distortion [1]. When developing criteria for power quality protection, it is critical to consider the high-frequency phenomena of lightning and ESD. Wiring and grounding practices for a special-purpose facility such as a data center (DC) require serious attention to risk and damage prevention.

    A DC has a unique and complex power infrastructure that is difficult and time-consuming to repair. It is important to understand the effect of power disturbances on data center equipment and on the processes needed to return the system to normal operation. A process interruption caused by a power outage or transient voltage may require a complete restart or the repair of components, which affects the time to repair (TTR) or mean time to repair (MTTR) [10], [12], [15]. The more obvious consequence is reduced data center availability for services or production. Downtime cost models for data centers have been presented by many researchers [3], [9], [11].

    Manuscript received July 15, 2008. Montri Wiboonrat is a Ph.D. candidate of the Graduate School of Information Technology: Computer and Engineering Management, Assumption University, Bangkok, Thailand ([email protected]).

    The data center power distribution system (DC-PDS) is modeled to optimize an objective function that trades downtime costs against investment in devices, operation, and energy consumption. Earlier data center planning, before the millennium, was static: it considered only a single planning period, based on the technologies and point demands of the time. Newer design, after TIA 942-2005, uses dynamic planning that concentrates on optimization, efficiency, and effective utilization of power, space, reliability/availability, and investment.

    Many standards support the DC-PDS design model, e.g. TIA 942-2005, IEEE 446-1995, IEEE 493-2007, IEEE 1100-1999, IEC 62040-3-1999, ASHRAE, and EN 1047-2. In practice, DC-PDS design is widely done ad hoc, shaped by the internal and external constraints of each organization. Each business's risk acceptance varies with its downtime cost model [9], [11]. For example, a banking service requires the highest data center reliability, 99.9999% availability, or close to zero downtime. An oil and gas production plant may be able to stop its data center for a few hours per year for overall system maintenance. Raising the level of reliability/availability means a higher acquisition investment. This investment needs to be balanced against the cost of downtime and business reputation [11], [13].

    In this paper, the researcher presents a risk anatomy that can help data center designers or operators identify the single points of failure (SPoF) of a DC-PDS and improve power reliability with an investment that is optimal for the accepted level of risk. The research integrates and applies the international standards [5], [6], [7], [8], [16] as a basis for minimum requirements. A risk zone assessment model of the DC-PDS evaluates power distribution reliability and incorporates it into the overall objective function, which weighs downtime costs against investment, operation, and efficiency.

    II. DOWN TIME COST MODEL

    The cost of an outage to a company includes not only lost revenue but also the time wasted by employees who cannot get their work done during the outage. Loss of data center availability directly affects the facility infrastructure's bottom line, since full recovery after even a short-lived unplanned downtime takes from a day to a week. Downtime cost depends on two major factors: the frequency and the duration of power outages.

    Risk Anatomy of Data Center Power Distribution Systems

    Montri Wiboonrat, Assumption University, Bangkok, Thailand

    978-1-4244-1888-6/08/$25.00 © 2008 IEEE

    Authorized licensed use limited to: David Ibarra. Downloaded on February 2, 2009 at 20:30 from IEEE Xplore. Restrictions apply.

    Business losses justify the investment cost of a given data center Tier availability. The estimated business loss per hour should be offset by the investment made up front, and the gain can be expressed with the return on investment (ROI) model shown as follows [2], [11].

    $$\mathrm{ROI} = \frac{\text{Benefits (Downtime Costs)} - \text{Investment Costs}}{\text{Investment Costs}}$$
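As a rough illustration of the ROI relation, the sketch below assumes the common form ROI = (benefits - investment) / investment, with the benefit taken to be the downtime cost that the investment avoids. The function name and the sample figures are hypothetical and are not taken from the paper.

```python
def roi(avoided_downtime_cost: float, investment_cost: float) -> float:
    """Return ROI as a fraction.

    Assumption: ROI = (benefits - investment) / investment, where the
    benefit is the downtime cost avoided thanks to the investment.
    """
    if investment_cost <= 0:
        raise ValueError("investment_cost must be positive")
    return (avoided_downtime_cost - investment_cost) / investment_cost


if __name__ == "__main__":
    # Hypothetical figures: 3.0M of avoided downtime cost for 2.0M invested.
    print(roi(3_000_000, 2_000_000))
```

A negative result means the avoided downtime cost does not cover the investment.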

    Reputation (R) and Goodwill (G) are the hardest factors to express in money terms because they are subjective. They depend on the business segment and the customer group affected, as shown in (1).

    λ : Frequency of interruption (occurrences per year)

    t : Duration of interruption (at least one hour per occurrence; integer number)

    L : Cost of business lost per hour of occurrence (estimated average cost of an hour of downtime)

    R : Business losses in terms of reputation and business accountability (subjective)

    G : Loss of reliable relations with partners and suppliers (goodwill, subjective)

    ΔI : Cost tradeoffs as system reliability increases

    ΔAvi : Increase in system availability of the Tier

    ΔI can vary widely depending on component brands, component inherent characteristics (CIC), and system connectivity topology (SCT). Data center site reliability/availability depends on the details of component selection (CIC) and on the system connectivity topology (SCT), e.g. series-parallel, k-out-of-n, bridge, and active-standby mode [13], [14], [15].

    The correlation between data center investment and availability is illustrated in Fig. 1. The optimal DC availability range differs from business to business, subject to the levels of (1) and the acceptable losses $f(L_{(i,t)})$ [11]. However, the simulation results show that higher investment does not yield higher availability beyond the inverse availability point, as depicted in Fig. 1.

    [Figure 1 labels: availability levels Tier I, Tier II, Tier III, Tier IV - IV+; unavailability cost; under-availability, optimal availability range, and inverse availability regions; optimal availability point and inverse availability point.]

    Fig. 1. Optimum availability and investment tradeoffs [11]

    $$\begin{aligned}
    L ={}& (\text{Employee cost/hour} \times \text{Employees affected by outage}) \\
    &+ (\text{Avg. revenue/hour} \times \text{Revenue affected by outage}) \\
    &+ (\text{Replaced or changed equipment costs} + \text{Resume labor hours}) \\
    &+ \sum_{i=1}^{n} (\text{Avg. lawsuits/hour} \times \text{No. of Contract}(i)) \\
    &+ (\text{Business reputation lost to customers: subjective}) \\
    &+ (\text{Loss of goodwill to partners and suppliers: subjective})
    \end{aligned}$$
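The terms of L can be added up mechanically once each one has been estimated. The following is a minimal sketch under stated assumptions: "Rev. affected by outage" is treated as a fraction between 0 and 1, the resume labor term is entered as a cost, and all field names and sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class OutageInputs:
    """Hypothetical container for the terms of the hourly loss L."""
    employee_cost_per_hour: float
    employees_affected: int
    avg_revenue_per_hour: float
    revenue_fraction_affected: float      # assumption: a share in [0, 1]
    equipment_replacement_cost: float
    resume_labor_cost: float              # assumption: labor hours already priced
    avg_lawsuit_cost_per_hour: float
    contracts: Sequence[int] = ()         # No. of Contract(i), i = 1..n
    reputation_loss: float = 0.0          # subjective estimate
    goodwill_loss: float = 0.0            # subjective estimate


def hourly_loss(x: OutageInputs) -> float:
    """Sum the six groups of terms that make up L."""
    staff = x.employee_cost_per_hour * x.employees_affected
    revenue = x.avg_revenue_per_hour * x.revenue_fraction_affected
    repair = x.equipment_replacement_cost + x.resume_labor_cost
    lawsuits = sum(x.avg_lawsuit_cost_per_hour * n for n in x.contracts)
    return staff + revenue + repair + lawsuits + x.reputation_loss + x.goodwill_loss


if __name__ == "__main__":
    sample = OutageInputs(50.0, 100, 20_000.0, 0.5, 3_000.0, 1_000.0, 200.0, (2, 3))
    print(hourly_loss(sample))
```

Keeping the subjective terms as explicit inputs makes it easy to see how much of the estimate rests on judgment rather than on accounting data.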

    Lost revenue per hour differs from business to business, e.g. brokerage operations US$6.45M, credit card authorization US$2.6M, eBay US$225K, Amazon.com US$180K, cellular service activation US$41K, and ATM service fees US$14K [3].

    Which factors matter depends on each business type's awareness of the tradeoff shown in (1). The optimal point between data center site availability and investment cost is derived from the slope in Fig. 1 together with the ROI result.

    $$R + G + (\lambda \cdot t \cdot L) \geq \Delta I \qquad (1)$$
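A possible way to apply (1) is sketched below. It reads (1) as a comparison between the total expected yearly loss, R + G + (frequency x t x L), and the investment increment; the direction of the comparison (loss at least as large as the investment) is an assumption, and the function name and figures are hypothetical.

```python
def investment_justified(
    interruptions_per_year: float,
    hours_per_interruption: int,
    loss_per_hour: float,
    reputation_loss: float,
    goodwill_loss: float,
    investment_increment: float,
) -> bool:
    """Return True when the expected yearly loss covers the extra investment.

    Assumption: (1) compares R + G + (frequency * t * L) against the
    investment increment, and the investment is justified when the
    expected loss is at least as large.
    """
    expected_loss = (
        reputation_loss
        + goodwill_loss
        + interruptions_per_year * hours_per_interruption * loss_per_hour
    )
    return expected_loss >= investment_increment


if __name__ == "__main__":
    # Hypothetical site: 2 interruptions/year, 3 h each, 100k lost per hour.
    print(investment_justified(2, 3, 100_000, 50_000, 25_000, 500_000))
```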

    Business loss depends not only on the type of business but also on time, as seen in Fig. 2. The relation between business loss and elapsed time is exponential, as shown in (2). Consider an international bank that operates across time zones, starting from Japan and moving on to Australia, Hong Kong, Singapore, and Thailand. Transactions between countries overlap across time zones: transactions from Japan to Australia start first, followed by Japan to Thailand, Japan to England, and so on. The effect of data center downtime therefore accumulates, and the damage grows as a chain reaction, as depicted in Fig. 2 by the accumulation function $f(L_{(i,t)})$.

    Assumption: each downtime starts at 1, and t is equal to or greater than 1 hour.

    Fig. 2. Time dependency accumulation losses [11]

    $$f(L_{(i,t)}) = \sum_{i=1}^{t} L \cdot e^{(i-1)}, \quad t \geq 1 \qquad (2)$$

    $L_{(i,t)}$ : Time-dependent accumulated losses.
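To see how quickly the accumulated loss grows, the sketch below evaluates (2) under the assumption that it reads as the sum of L multiplied by e^(i-1) for i from 1 to t, with t a whole number of hours. The function name is hypothetical.

```python
import math


def accumulated_loss(loss_per_hour: float, hours: int) -> float:
    """Evaluate the sum of L * e^(i - 1) for i = 1..hours.

    Assumption: this is the form of equation (2); hours must be >= 1.
    """
    if hours < 1:
        raise ValueError("hours must be at least 1")
    return sum(loss_per_hour * math.exp(i - 1) for i in range(1, hours + 1))


if __name__ == "__main__":
    for t in (1, 2, 3, 4):
        print(t, round(accumulated_loss(1.0, t), 3))
```

With this reading, the first hour costs exactly L and each further hour multiplies the hourly contribution by e, which is why long outages dominate the total.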


    III. RELIABILITY ASSUMPTION MODEL

    A. Tier IV Data Center Model

    The TIA-942 Tier IV data center is taken as the baseline fault-tolerance model for risk assessment, from the utility incomer through to the load points, as depicted in Fig. 3.

    Fig. 3. Tier IV Data Center Diagram [16]

    B. IEEE 493-2007, 2(N+1) Model

    To strengthen the critical protection source, the UPS, the system requires one out of N components to operate. The design supplies the critical loads in parallel from 2(N+1) separate, independently operated UPSs with static transfer switches (STS) and manual bypass. The annual availability of the 2(N+1) configuration is 99.99914%, with a 16.49% probability of failure over 5 years, as depicted in Fig. 4.

    Fig. 4. IEEE 493, 2(N+1) Power Equipment [7]

    C. Fault Tolerance DC-PDS Model

    A fault-tolerant topology is designed to eliminate single points of failure (SPoF) from the DC-PDS. Power quality problems are mitigated by applying power conditioning technology, as depicted in Fig. 5. Zero downtime is the main purpose of data center operation. Maintaining the system without interrupting operation not only extends equipment life but also prevents equipment failure before the MTTF.

    IV. POWER QUALITY ZONE ASSESSMENT

    The researcher proposes a fault tolerance analysis approach model.

    A. High Voltage: Zone 0

    The TIA-942 Tier IV model, with 99.995% uptime, assumes that the utility grids supporting the data center are independent of each other. With a second utility grid, 95% of power quality (PQ) problems can be avoided. PQ reliability differs from location to location and country to country, especially between developing and developed countries. According to [9], Table I, PQ reliability is only 99.74924%, which means 21.96657 hours of downtime per year. The gap between the 99.995% requirement and the real-life PQ figure of 99.74924% is called risk acceptance. Power outages caused by natural disasters are uncontrollable and unpredictable.
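The downtime figure quoted for Zone 0 follows directly from the availability percentage. The short sketch below converts availability to hours of downtime per year, assuming a 365-day year of 8,760 hours; the function and constant names are illustrative.

```python
HOURS_PER_YEAR = 8760  # assumption: 365-day year


def annual_downtime_hours(availability_percent: float) -> float:
    """Convert an availability percentage into expected downtime hours per year."""
    return (100.0 - availability_percent) / 100.0 * HOURS_PER_YEAR


if __name__ == "__main__":
    tier_iv = annual_downtime_hours(99.995)      # Tier IV requirement
    observed = annual_downtime_hours(99.74924)   # PQ reliability from Table I
    print(f"Tier IV requirement: {tier_iv:.3f} h/year")
    print(f"Observed utility PQ: {observed:.3f} h/year")
    print(f"Gap (risk acceptance): {observed - tier_iv:.3f} h/year")
```

Expressed in hours, the gap between the utility supply and the Tier IV target is what the downstream zones (generators, UPSs, and parallel paths) have to cover.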

    TABLE I. POWER QUALITY DISRUPTIONS [9]

    B. Low Voltage: Zone I, Main Distribution Board

    Transformers, diesel engines, and ATSs are the critical components in this zone, because the least reliable piece of equipment sets the reliability of the whole system. The diesel engine has the weakest MTBF in this model [7], since it has the highest failure rate. To eliminate this reliability risk, the design requires a parallel 2(N+1) system to ensure that power remains available. The rest of the equipment is designed as 2N parallel, A Side and B Side, as shown in Fig. 5, Zone I.
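The effect of the weakest component and of a parallel A/B arrangement can be illustrated with the standard reliability block formulas for series and parallel systems. This is only a sketch: it assumes independent failures, and the component availabilities below are hypothetical placeholders, not values from IEEE 493 or from this paper.

```python
from math import prod
from typing import Iterable


def series_availability(avails: Iterable[float]) -> float:
    """All components must work: multiply the availabilities."""
    return prod(avails)


def parallel_availability(avails: Iterable[float]) -> float:
    """At least one path must work: 1 minus the product of unavailabilities."""
    return 1.0 - prod(1.0 - a for a in avails)


if __name__ == "__main__":
    # Hypothetical single path: transformer, diesel engine, ATS in series.
    path = series_availability([0.9999, 0.99, 0.9995])
    # Two such independent paths (A Side and B Side) in parallel.
    both = parallel_availability([path, path])
    print(f"single path: {path:.6f}")
    print(f"A/B parallel: {both:.6f}")
```

A series path can never be more available than its weakest component, while duplicating the path squares the unavailability, which is the reason for putting the least reliable equipment in parallel.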

    C. Low Voltage: Zone II, Uninterruptible Power Unit

    Zone II is the mission-critical zone of the data center because, in the first stage of a power outage, the UPSs continue to supply power to the loads immediately [10], [12]. A 2(N+1) UPS configuration is proposed to reduce reliability risk. Power outages of up to about 500 ms are ridden through, and the flywheel on the A Side can carry the load for 15-20 seconds. If the outage lasts longer than the flywheel can handle, the UPS plus batteries on the B Side keep supplying the loads.


    Fig. 5. A New Fault Tolerance DC-PDS Model


    At the same time, during those 15-20 seconds the diesel engines are already on standby to pick up the emergency load and back up the flywheel and the UPS plus batteries. A bypass isolation transformer with STS is designed to assure reliability while the UPSs are under maintenance. It also provides clean power downstream, as shown in Fig. 5, Zone II, on both the A Side and the B Side. A new design improvement to reduce MTTR is that all circuit breakers are of the draw-out type. PQ risk can be generated in this zone by non-linear equipment [4], [6]. However, it is prevented by the features of the UPS and the isolation transformer in Zone III before it reaches the critical loads.

    D. Low Voltage: Zone III, Power Disturbance Safe Operating Zone (PD-SOZ)

    In large centralized power systems, complex failure propagation across the power system requires coordination among circuit breakers. The design solution is to collocate 2(N+1) UPSs that supply the loads through separate, independent A and B Sides, as shown in Fig. 5, Zone III. An isolation transformer is applied in this zone to reduce both the imbalance and the third harmonic of non-linear loads, and also to reduce system electrical noise and increase the power factor of non-linear loads. For the parallel 2N power distribution design, A Side and B Side, the IEEE 493 data in Table 8-1, page 194, give a system MTBF of 188,654.5 hours, an MTTR of 1.64 hours, an availability of 99.99913%, and a probability of failure over 5 years of 16.16% [7].
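The availability quoted for Zone III can be checked against the quoted MTBF and MTTR with the usual steady-state relation, availability = MTBF / (MTBF + MTTR). The sketch below only reproduces the availability; the 5-year failure probability is taken from the IEEE 493 data and is not recomputed here.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures and mean time to repair."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    a = steady_state_availability(188_654.5, 1.64)
    print(f"{a * 100:.5f}%")  # prints 99.99913%
```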

    V. DISCUSSION

    Fig. 6 shows the distribution of sags and outages per site per year. How much weight to give each investment in protecting critical equipment should be decided by analyzing the PQ history of the data center site. If records over several years show that more than 50% of interruptions last less than 10 seconds, investment in UPSs, either flywheel UPSs or UPSs plus batteries, can be effective. If instead the records show that 20-50% of interruptions last more than 10 minutes, investment in N+1 diesel generators can be effective. In other cases, the investment can be balanced equally between UPSs and diesel generators. The researcher recommends a voltage and frequency independent (VFI) UPS with a triple Classification 1 rating to improve system efficiency [4], [5].
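The rule of thumb above can be written as a small decision function over a site's interruption history. This is one possible reading, not the author's procedure: it assumes the thresholds refer to the share of recorded interruptions, treats the 20-50% band as a lower threshold of 20%, and checks the short-interruption rule first. The function name and return labels are hypothetical.

```python
from typing import Sequence


def recommend_investment(durations_s: Sequence[float]) -> str:
    """Suggest where to invest, given interruption durations in seconds.

    Assumptions: more than 50% of events under 10 s favors UPSs; otherwise
    at least 20% of events over 10 minutes favors N+1 diesel generators;
    anything else is treated as a balanced case.
    """
    if not durations_s:
        raise ValueError("at least one recorded interruption is required")
    n = len(durations_s)
    short_share = sum(d < 10 for d in durations_s) / n
    long_share = sum(d > 600 for d in durations_s) / n
    if short_share > 0.5:
        return "UPS"
    if long_share >= 0.2:
        return "diesel generators (N+1)"
    return "balanced UPS and diesel generators"


if __name__ == "__main__":
    history = [0.4, 2.0, 5.0, 8.0, 45.0, 900.0]  # hypothetical site record
    print(recommend_investment(history))
```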

    To reach the required level of continuous business availability, the system needs preventive processes for the critical load points. The following processes need to be taken into consideration:

    1) Operators require comprehensive training on the existing system design, the power distribution system layout, and common problems and their solutions. This training prevents human errors of commission and omission during daily operation and regular maintenance.

    2) At the start of the design process, consultants or designers need to consider the international standards and the latest equipment technologies, and confirm that those technologies are mature in their operation and maintenance procedures. High reliability (MTTF) and correct sizing of the selected equipment prevent short operating life and overcurrent trips, and work together with energy effectiveness, optimal investment, and maintenance costs.

    3) Contingency plans need to be instituted to guard against natural disasters, which are unpredictable and uncontrollable.

    Fig. 6. Distribution of Sags and Outages per Site per Year [9]

    The relation between the downtime cost model and the reliability model is called the optimum availability and investment tradeoff: designers and investors need to discuss what level of availability is sufficient and achievable with a constrained investment. The result shall satisfy (1). Investment and data center availability are not the only concerns; data center downtime can also destroy a business [12].

    VI. CONCLUSION

    Next-generation data center power distribution system planning must satisfy growing and changing load demand over the planning period and support critical operation under the concepts of safety, reliability, consistency, dependability, optimization, utilization, efficiency, and regulatory compliance. Risk analysis of the data center power distribution system is needed to understand the nature of equipment function and stage failures so that preventive and corrective actions can be taken. A planned system downtime is much better than an unplanned one.

    REFERENCES

    [1] G. O. Young, "Synthetic structure of industrial plastics (Book style with paper title and editor)," in Plastics, 2nd ed. vol. 3, J. Peters, Ed. New York: McGraw-Hill, 1964, pp. 15-64.

    [2] A. Bendre, D. Divan, W. Kranz, and W. Brumsickle, "Equipment Failures Caused by Power Quality Disturbances," 39th IAS Annual Meeting, Industry Application Conference, Vol. 1, 3-7 Oct. 2004, pp. 482-489.

    [3] B. Boehm, L. Huang, A. Jain, and R. Madachy, "The ROI of Software Dependability: The iDAVE Model," IEEE Software, May/June 2004, pp. 54-61.


    [4] K. Davidson, K. Darrow, T. Bryson, and B. Major, "Advanced Microturbine System (AMTS) Market Study," prepared for DOE and Capstone Turbine Corporation, prepared by Onsite Energy Corporation, April 2001.

    [5] W. Solter, "A New International UPS Classification by IEC 62040-3," 24th Annual International Telecommunications Energy Conference, INTELEC 2002, 2002, pp. 541-545.

    [6] IEC 62040-3 ED. 1.0 B: 1999, "Uninterruptible power systems (UPS), Part 3: Method of specifying the performance and test requirements," 1999.

    [7] IEEE Std 446-1995 (Revision of IEEE Std 446-1987), "IEEE Recommended Practice for Emergency and Standby Power Systems for Industrial and Commercial Applications," 12 December 1995.

    [8] IEEE Std 493-2007 (Revision of IEEE 493-1997), "Recommended Practice for Design of Reliable Industrial and Commercial Power System," Gold Book, 7 February 2007.

    [9] IEEE Std 1100-1999 (Revision of IEEE Std 1100-1992), "IEEE Recommended Practice for Powering and Grounding Electronic Equipment," 22 March 1999.

    [10] K. Darrow and B. Hedman, "The Role of Distributed Generation in Power Quality and Reliability," New York State Energy Research and Development Authority, December 2005.

    [11] M. Wiboonrat, "An Empirical Study on Data Center System Failure Diagnosis," 3rd International Conference on Internet Monitoring and Protection, IEEE ICIMP 2008, Romania, June 29-July 5, 2008, accepted for publication.

    [12] M. Wiboonrat, "An Optimal Data Center Availability and Investment Trade-Offs," 9th International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, IEEE SNPD 2008, Thailand, August 6-8, 2008, accepted for publication.

    [13] M. Wiboonrat, "Dependability Analysis of Data Center Tier III," 13th International Telecommunications Network Strategy and Planning Symposium, NETWORKS 2008, Budapest, Hungary, Sept 28-Oct 2, 2008, accepted for publication.

    [14] M. Wiboonrat, "Power Reliability and Cost Trade-Offs: A Comparative Evaluation between Tier III and Tier IV Data Centers," Power Conversion and Power Management, Digital Power Forum 2007, San Francisco, CA, September 10-12, 2007.

    [15] M. Wiboonrat, "Beyond Data Center Tier IV Reliability Enhancement," Power Conversion and Power Management, Digital Power Europe 2007, Munich, Germany, November 13-15, 2007.

    [16] M. Wiboonrat and C. Jungthirapanich, "Reliability Enhancement via the Failure Modes, Effects, and Criticality Analysis (FMECA) and the Reliability Block Diagram (RBD)," 8th International Conference on Opers. & Quant. Management, ICOQM 2007, Bangkok, Thailand, October 17-20, 2007.

    [17] W. P. Turner IV, J. H. Seader, V. Renaud, and K. G. Brill, "Tier Classification Define Site Infrastructure Performance," White Paper, The Uptime Institute, Inc., 2008.
