Support for dynamic adaptation of power-aware server clusters

Vinicius Petrucci, Orlando Loques
Fluminense Federal University, Brazil

Daniel Mossé
University of Pittsburgh, USA

March, 2009

Research context

• Dynamic computing environments
  – varying workloads
  – resource variability (including component failures)
  – changing user needs

• Applications have to cope with changes
  – adaptive behavior requirement

• Support for dynamic adaptations
  – reusable infrastructure
  – adaptation language

Application cases

• Server clusters
  – power optimization and QoS control
• Wireless sensor networks
  – bandwidth availability, data reliability (accuracy), power optimization
• Overlay networks
  – topology reconfiguration
• Grids
  – shared (heterogeneous) resources with varying quality and availability
• Pervasive / ubiquitous computing

Wireless sensor networks

Distributed autonomous devices that rely on sensors to cooperatively monitor physical or environmental conditions

* Wikipedia.org

Example: energy optimization can be achieved by turning some sensors on/off

Videoconferencing system

A set of servers (called reflectors) that route the audio/video streams to the participating clients.

Example: monitoring and control of the reflector configuration to meet QoS requirements

[Figure: reflectors refUFF, refLMPD, and refUERJ routing streams to clients]

Server clusters

Example: power optimization while meeting performance / QoS requirements

Clients see a single view of the cluster through the ServerCluster component (load balancer); requests are processed by back-end servers

Problem

• Adaptive policies for applications
  – implementation may be complex in itself
  – most are implemented in an ad-hoc fashion

• Code for adaptation policies is
  – mixed with the application code
  – costly and difficult to modify and maintain in a real operational environment

Approach

• Generic solution to support adaptations
  – external reusable infrastructure to monitor and adapt running applications
  – contract-based adaptation language for representing high-level policies

• Software architecture abstractions
  – representation of application configurations
  – stored as meta-level data (object model)

Related work

• Rainbow (CMU)
  – adaptation language + supporting infrastructure
• Autonomic managers (IBM)
  – provide a generic view of autonomic computing
• Jade (INRIA)
  – lacks adaptation knowledge representation
• CASA (Univ. of Zurich)
  – contract-based language using XML
• We propose a lightweight approach based on scripting/dynamic language facilities

Autonomic computing (IBM)

Knowledge: adaptation models, data, and scripts

General feedback control loop

Adaptation framework

Adaptation language

• Profiles
  – conditions for triggering adaptations
• Adaptations
  – steps to move an application away from an undesirable condition
• Negotiation clauses
  – particular order to deploy the adaptations
• Constructs: adapt_period, settling_time
  – cater for timing issues of adaptation

Adaptation framework

Scripting languages

• Scripting/dynamic language (Python)
  – high-level abstractions for expressing dynamic adaptation policies
  – built-in functions simplify infrastructure development (e.g., compile, exec)
• Abstract adaptation operators
  – mapped to application-level operations at run-time
  – may rely on APIs provided by the app support level (e.g., Apache modules API)
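The role of Python's built-ins can be sketched as follows. This is a minimal illustration, not the framework's actual code: the `webcluster` stand-in, the `env` namespace, and the `actions` list are hypothetical names introduced here. A profile is compiled once as an expression, an adaptation body as a block of statements, and both are evaluated against a shared namespace.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the monitored cluster object.
webcluster = SimpleNamespace(load=600.0, maxLoad=lambda: 1000.0)

# Namespace visible to policy code: thresholds plus monitored objects.
env = {"webcluster": webcluster, "T_LOW": 0.70, "T_HIGH": 0.85, "actions": []}

# A profile is a boolean expression, compiled once in "eval" mode.
util_low = compile("webcluster.load / webcluster.maxLoad() < T_LOW",
                   "<profile util_low>", "eval")

# An adaptation body is a block of statements, compiled in "exec" mode.
adjust = compile("actions.append('reduce capacity')",
                 "<adaptation adjust>", "exec")

triggered = eval(util_low, env)   # 600 / 1000 = 0.6, which is below T_LOW
if triggered:
    exec(adjust, env)

print(triggered, env["actions"])
```

Compiling once and evaluating repeatedly keeps the per-period cost of checking a profile low, which matters when contracts are polled every few seconds.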

Multiple adaptation contracts

• Support for multiple domains of adaptation
  – each contract has one thread of control

• Simple concurrency model
  – global locking mechanism
  – first-come, first-served approach

while contract.running:
    for a in contract.adaptations:
        if a.profile is True:
            execute adaptation code of "a"
            sleep for "settling_time" interval
    sleep for "adapt_period" interval
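The loop above can be turned into a small runnable sketch with one thread per contract and a single global lock. The class names (`Adaptation`, `Contract`), the use of seconds rather than milliseconds, and the choice to hold the lock only while checking the profile and running the action are assumptions made for illustration. Python's `threading.Lock` also does not guarantee strict first-come, first-served ordering, so this only approximates the policy named on the slide.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, List

# One lock shared by all contracts: only one adaptation runs at a time.
GLOBAL_LOCK = threading.Lock()

@dataclass
class Adaptation:
    profile: Callable[[], bool]   # triggering condition
    action: Callable[[], None]    # adaptation steps
    settling_time: float          # seconds to wait after acting

@dataclass
class Contract:
    adaptations: List[Adaptation]
    adapt_period: float           # seconds between evaluation rounds
    running: bool = True

def run_contract(contract: Contract) -> None:
    while contract.running:
        for a in contract.adaptations:
            with GLOBAL_LOCK:
                fired = a.profile()
                if fired:
                    a.action()
            if fired:
                time.sleep(a.settling_time)
        time.sleep(contract.adapt_period)

# Usage: one contract whose action clears its own triggering condition.
log = []
state = {"load": 0.9}

def add_capacity():
    log.append("add capacity")
    state["load"] = 0.8

c = Contract([Adaptation(lambda: state["load"] > 0.85, add_capacity, 0.01)],
             adapt_period=0.01)
t = threading.Thread(target=run_contract, args=(c,))
t.start()
time.sleep(0.1)
c.running = False
t.join()
print(log)
```

Because the action lowers the load below the threshold, the profile fires once and stays quiet on later rounds.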

The case of server clusters

• Server utilization remains very low
  – average about 6%

• Energy consumption is high and growing
  – about 9% per year

• Carbon emissions are set to quadruple by 2012
  – projected to surpass the airline industry

• Great opportunity for dynamic adaptations

Source: Uptime Institute (McKinsey & Co. Report, http://uptimeinstitute.org)

Dynamic adaptations

• Dynamic adaptation capabilities
  – CPU DVFS (dynamic voltage/frequency scaling)
  – server on/off mechanisms (e.g., suspend-to-RAM + wake-on-LAN)

• Power and performance trade-off
  – servers' capacity management to reduce energy consumption
  – guarantee of QoS requirements (e.g., utilization or response time)

Configuration problem

Notation
• N = number of servers
• F_i = number of frequencies of server i
• p_busy, p_idle = power cost
• perf = servers' performance
• x_ij = decision variable
• demand = incoming workload

Objective
• minimize the overall power consumption

Constraints
• associate decision variable x_ij with objective function variables
• select only one frequency on a given server
• handle the incoming workload (given by demand)
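For a handful of servers, the configuration problem can be illustrated by exhaustive search. The sketch below is not the authors' formulation: it assumes that x_ij means "server i runs at frequency j", that load is spread in proportion to capacity, and that an active server draws p_idle plus a share of (p_busy - p_idle) proportional to its utilization. All numbers are made up for illustration.

```python
from itertools import product

def best_config(p_busy, p_idle, perf, demand):
    """Return (power, choice), where choice[i] is the frequency index
    selected for server i, or None if that server is switched off.
    Returns None when no configuration can handle the demand."""
    best = None
    # Each server picks exactly one option: off, or one of its frequencies.
    options = [[None] + list(range(len(freqs))) for freqs in perf]
    for choice in product(*options):
        active = [(i, j) for i, j in enumerate(choice) if j is not None]
        capacity = sum(perf[i][j] for i, j in active)
        if capacity < demand:
            continue  # cannot handle the incoming workload
        util = demand / capacity if capacity else 0.0
        power = sum(p_idle[i][j] + (p_busy[i][j] - p_idle[i][j]) * util
                    for i, j in active)
        if best is None or power < best[0]:
            best = (power, choice)
    return best

# Two hypothetical servers with two frequencies each.
perf   = [[100, 150], [80, 160]]   # requests/s at each frequency
p_busy = [[60, 90], [50, 100]]     # watts at full utilization
p_idle = [[40, 55], [35, 60]]      # watts when idle

print(best_config(p_busy, p_idle, perf, demand=170))
print(best_config(p_busy, p_idle, perf, demand=150))
```

Exhaustive search grows exponentially with the number of servers, so it only serves to make the objective and constraints concrete; a real cluster would need an optimization solver or a heuristic.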

Adaptation example

• Thresholds for cluster utilization
  – e.g., T_LOW = 0.70 and T_HIGH = 0.85

profile { webcluster.load / webcluster.maxLoad() < T_LOW } util_low;

profile { webcluster.load / webcluster.maxLoad() > T_HIGH } util_high;

Adaptation example

contract {
  adaptation {
    demand = webcluster.load / T_HIGH
    changeConf = webcluster.bestConfig(demand)
    for (s, f) in changeConf:
      if f == 0:
        webcluster.turnOff(s)
      else:
        if s.status == 0:
          webcluster.turnOn(s)
        webcluster.adjustFreq(s, f)
  } adjustCluster with util_low or util_high \
    settling_time 6000/*ms*/;
} decision1 adapt_period 5000/*ms*/;
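The adaptation body can be exercised without real hardware. In the sketch below, `Server` and `MockCluster` are hypothetical stand-ins that record operator calls, and `bestConfig` simply returns a canned plan. Dividing the load by T_HIGH sizes the target capacity so that utilization after the change is at most T_HIGH, since capacity >= load / T_HIGH implies load / capacity <= T_HIGH.

```python
from dataclasses import dataclass

T_HIGH = 0.85

@dataclass
class Server:
    name: str
    status: int = 0   # 0 = off, 1 = on
    freq: int = 0     # current frequency in MHz, 0 when off

class MockCluster:
    """Hypothetical stand-in that records operator calls."""
    def __init__(self, load, plan):
        self.load = load
        self._plan = plan   # canned answer returned by bestConfig
        self.calls = []

    def bestConfig(self, demand):
        return self._plan

    def turnOn(self, s):
        s.status = 1
        self.calls.append(("on", s.name))

    def turnOff(self, s):
        s.status, s.freq = 0, 0
        self.calls.append(("off", s.name))

    def adjustFreq(self, s, f):
        s.freq = f
        self.calls.append(("freq", s.name, f))

def adjust_cluster(webcluster):
    # Same steps as the adjustCluster adaptation in the contract.
    demand = webcluster.load / T_HIGH
    for s, f in webcluster.bestConfig(demand):
        if f == 0:
            webcluster.turnOff(s)
        else:
            if s.status == 0:
                webcluster.turnOn(s)
            webcluster.adjustFreq(s, f)

a = Server("a", status=1, freq=2000)
b = Server("b")
c = Server("c", status=1, freq=1800)
cluster = MockCluster(load=510.0, plan=[(a, 1000), (b, 1800), (c, 0)])
adjust_cluster(cluster)
print(cluster.calls)
```

Recording calls instead of acting on machines makes it easy to compare the disruption caused by different policies before deploying them.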

Adaptation example

• Common monitoring support
  – e.g., variable access: webcluster.load

• Reusable adaptation operators
  – e.g., webcluster.turnOn(), webcluster.turnOff()

• Some policy-specific operators can also be defined
  – e.g., webcluster.bestConfig()

• Different adaptation policies can be used

Application-specific layer

• Apache built-in load balancer module
  – mod_proxy_balancer

• New Apache module in C (mod_frontend)
  – exposes an API (XML-RPC) for
    • monitoring system properties
    • controlling the front-end web server
  – example
    • sensors -> load (req/s), request response time
    • actuators -> DVS, on/off

Experimental evaluation

• Dedicated web cluster testbed

Controlling cluster utilization

Power/energy savings

* Energy consumption reduced by about 37% compared to running without adaptations

Using different quality metric

Supporting multiple contracts

* Running concurrent adaptation contracts: power management and fault tolerance

Different adaptation policies

Disruption: the number of turn-on (and turn-off) adaptations, each of which may involve a switching cost.

What is the best way to minimize both disruption and energy consumption?

Future study: anticipatory adaptation model, risk-aware controller ...

Adaptation time overhead

• Worst case measured (overall adaptation phase): 13,045.78 ms
• Operations: 1 on, 1 off, and 2 adj. freq. => 12,012 ms + 1,005 ms + 7 ms + 7 ms = 13,013 ms
• Framework overhead = 32.78 ms

Conclusion

• Framework-based approach to support dynamic adaptations
  – power and performance management for server clusters

• Reusability of the adaptation infrastructure
  – simplifies both evaluation and management of different adaptation policies / requirements
  – helps to reduce the development cost of adaptive applications

Future work

• Improvements in the framework
  – forecasting for adaptation decisions

• Other power-aware adaptations
  – multi-core architecture / memory systems

• Optimization algorithms for adaptation
  – processor allocation among multiple services / applications

• Experimental evaluation
  – virtualization -> consolidation, live migration
  – more realistic/real workloads

Power and performance

Fault tolerance contract

contract {
  adaptation {
    srv = webcluster.getFailedServer()
    newsrv = webcluster.allocNewServer()
    if newsrv:
      webcluster.replaceServer(srv, newsrv)
    else:
      webcluster.log("could not allocate server")
  } repair with server_fail settling_time 4000/*ms*/;
} fault_tolerance adapt_period 1000/*ms*/;

profile { webcluster.failure > 0 } server_fail;

Filter modules

Holt's method
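The slide names the method without further detail. As a point of reference, the textbook form of Holt's linear trend method (double exponential smoothing) is sketched below. The function name, the initialization from the first two samples, and the smoothing constants are assumptions for illustration, not the settings used in the experiments.

```python
def holt_forecast(series, alpha, beta, horizon=1):
    """Holt's linear trend method.

    level_t = alpha * y_t + (1 - alpha) * (level_{t-1} + trend_{t-1})
    trend_t = beta * (level_t - level_{t-1}) + (1 - beta) * trend_{t-1}
    forecast(t + h) = level_t + h * trend_t
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Assumed initialization: first sample as level, first difference as trend.
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + horizon * trend

# Hypothetical load samples (req/s) that rise by 2 each period.
loads = [10.0, 12.0, 14.0, 16.0]
print(holt_forecast(loads, alpha=0.5, beta=0.5, horizon=1))
print(holt_forecast(loads, alpha=0.5, beta=0.5, horizon=2))
```

On a perfectly linear series the method tracks the trend exactly, so the one-step forecast continues the line; on noisy data, alpha and beta control how quickly the level and trend react to new samples.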
