© 2003 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.
Quartermaster: A resource utility
Sharad Singhal
November 10, 2003
Contributors
ISSL:
Martin Arlitt
Sven Graupner
Ralf Koenig
Vijay Machiraju
Jim Pruyne
Jerry Rolia
Akhil Sahai
Xiaoyun Zhu
IETL:
Dirk Beyer
Pano Santos
Julie Ward
SRC*:
Rajeev Joshi*
Utility Computing
A customer perspective
• Offering computing as a service
• Customer pays only for the computing resources used
• Customer uses as much as needed, when needed
Survey question (N=125): Which term do you think best describes the concept of Service Centric Computing?
• Capacity on demand: 36%
• Pay as you grow: 32%
• Computing on demand: 20%
• Pay as you go: 7%
• Utility computing: 5%
Utility Computing
Motivations and players
• Confluence of factors
  – IT infrastructure has matured enough to be treated as a “commodity”
  – Pressures to reduce cost through consolidation and outsourcing
  – Interest in sharing resources, data, …
  – Desire to improve agility by making the computing “on-demand”
• Players
  – Provisioning frameworks/virtualization
    • HP (UDC), IBM (on-demand computing), Sun (N1), Microsoft (DSI)
    • ThinkDynamics (now IBM), Egenera, Veritas, Ejascent
  – Outsourcers/managed environments
    • Opsware, Terraspring (now Sun)
  – Grid/Clustering
    • Platform Computing, Avaki
Utility Computing
Challenges
• Computing cannot be treated as a commodity in the enterprise.
  – Legacy
  – Fragile applications
  – Every environment is different
• Computing cannot be obtained by just “plugging it in.”
  – Application/System configurations are “hardwired”
  – Lack of common standards
  – Too many “standards”
• Business models do not support utility computing.
  – Outsourcers are just beginning to understand the costs involved
  – SLAs on computing are unclear
  – Customers do not yet trust utility computing
Utility Computing
Requirements
• A virtualized view of the infrastructure
  – Provide infrastructure abstractions required by applications
  – Separate infrastructure evolution from application evolution
• Design tools that support deployment, configuration, and lifecycle management of applications
  – Customize applications to specific deployments
  – Create “Built-to-Order” computing environments
  – Capture user and operator policies in the tools
• Management tools that enable efficient use of infrastructure
  – Bring economy of scale to infrastructure management/use
  – Ensure application level requirements are met
Architecting a resource utility
Quartermaster vision: Our vision is to create a "compute resource utility" that allows HP customers to provide computation resources to applications and services "on-demand."
[Figure: architecture diagram with components labeled Service Portals, Applications, Systems Management, Services Management, Resource Utility Service, Resource Allocation System, Service Deployment System, Grid Services, and Resource Pool]
• An architecture blueprint for IT deployment
  – Standardized mechanisms for deployment of resources, applications, and services
  – Standardized APIs for interaction with resources and applications
  – A unified view of an increasingly heterogeneous infrastructure
• A partner and services ecosystem
  – An ecosystem for middleware and software partners
  – A repeatable consulting framework for HP Services and integration partners
• A standardized resource access framework
  – Adaptation of utility resources to the emerging Grid marketplace
Overview
• Architecture of Quartermaster
  – Research objectives and challenges
  – Resource request use case
  – Quartermaster architecture components
• Quartermaster research results
  – Policy-based resource construction
  – Grid-based resource reservation & scheduling
  – Optimized resource allocation and application provisioning
  – Model-based resource integration
• Summary & future work
Quartermaster research objectives
• Develop a set of tools, technologies, and techniques that
  – can take into account
    • the variety of resources available;
    • anticipated application workloads;
    • utility functions specified by customers;
    • dependencies among the above; and
  – can automatically
    • decide which resources to allocate, and when
  – while
    • resolving (possibly contradictory) demands on resources,
    • avoiding resource bottlenecks or “hot spots,”
    • handling non-additive resource allocations,
    • conforming to policies (e.g., managing risk, meeting figures of merit, failure handling) specified by the operator,
    • accommodating dynamic addition and removal of resources.
• Use the tools to demonstrate a working resource allocation prototype at a data-center scale
Issues in dynamic resource allocation
What’s hard?
• Dynamic use of resources
  – Resource demands vary over time
    • Capacity management, admission control, policing
    • Workload characterization
    • Resource allocation and assignment
  – Multiple resource types are involved
    • Resource composition and abstractions
    • Non-additive resources
  – Users/Operators have preferences
    • Policy reconciliation/management
• Dynamic resource pool
    • Life-cycle management
• Federated environment
  – Resources move between applications/customers
    • Security
    • Configuration management
It is not sufficient to just assign servers/virtual machines from a cluster to applications; in complex environments, many other hard engineering constraints and operator policies must be met when allocating resources!
Most related work does not take the complexity of the environment into account.
Imagine…
An IT manager receives an urgent request to meet certain business requirements:
“I need an e-commerce system that is capable of handling peaks of 300,000 queries per minute for a sales promotion starting two weeks from now.”
• Today:
  – IT designer takes 1 week to design the site configuration.
  – It takes time to scrounge up resources, so new ones usually have to be ordered at a premium.
  – It takes three weeks to get the environment built and running, and another week to test it.
  – Designs are a combination of guess and check, so the process requires a couple of iterations.
  – IT designer promises to deliver the system in 12 weeks to cover all contingencies.
• Tomorrow:
  – The IT designer drags in an e-commerce site on a palette available from a resource utility and enters requirements as policies (~5 minutes)
  – A complete description of the site is created automatically (~10 minutes)
  – The resource utility marshals the resources, and deploys the system (~few hours)
  – The designer makes on-line changes as content is added and the system is tested to ensure requirements are met (~1 week)
  – If necessary, low priority work is re-scheduled to allow time for new resources to be procured.
  – Three weeks later, when new resources arrive, they are plugged in, and the utility re-adjusts work to meet current demands
Quartermaster Architecture
1. User uses a resource composition service to design a custom environment (or selects a pre-configured template).
2. User schedules deployment of application.
3. Resources needed for the deployment are assigned.
4. Service is deployed, and
5. Resources are made available to user.
6. On-line monitoring is used to adjust resources as necessary.
7. Resource availability & utilization are used to improve future resource assignment decisions.
8. The type/inventory repository tracks any changes in resources.
[Figure: architecture diagram with numbered steps 1 to 8 connecting Grid, Resource composition, Resource allocation, Resource assignment, Service deployment, Operations control, Resource type & inventory repository, and Resources]
Results
Policy-based resource composition
• Given
  – Models of resources and applications with
    • Constraints embedded in these models from resource/application vendors,
  – Policies specified by system operators on how resources will be used,
  – Requirements on desired system by the user of the system,
• Automatically construct
  – A system specification that is provably compliant with all constraints, policies, and requirements.
[Figure: class diagram relating resource, attributes, policy, :constructionPolicy, and :constraint through "has", "kind of", "refers to", and "associated with" associations]
Key learning:
We are using Constraint Satisfaction (SAT) solvers to create custom environments based on policy. Our initial results are encouraging.
Akhil Sahai, Sharad Singhal, Rajeev Joshi, Vijay Machiraju, Automated Policy-Based Resource Construction in Utility Environments, to appear in NOMS 2004
Results
Policy-based resource composition
• Value:
  – Rapidly create “build-to-order” systems that satisfy multiple requirements
  – Leverage domain expertise through hierarchical and modular templates during system construction
• Technical innovation:
  – A method of specifying and embedding policies in system models. These policies are expressed as first-order logic and linear arithmetic expressions, so that systems may be constructed that are modular, flexible, and build upon each other, and that capture as policies
    • users’ requirements (that may be minimally specific),
    • operator’s constraints,
    • technical capabilities of the systems and the corresponding constraints
  – A constraint satisfaction engine that uses formal logic and reasoning to combine and solve all the relevant policies and produce a feasible solution.
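The idea of combining vendor constraints, operator policies, and user requirements into one feasible specification can be illustrated with a minimal sketch. This is not the Quartermaster engine: it swaps the solver described above for plain exhaustive enumeration over a tiny hypothetical model, and every attribute name, capacity figure, and cost below is an assumption made up for illustration (only the 300,000 queries-per-minute peak comes from the "Imagine…" scenario).

```python
from itertools import product

# Hypothetical resource model: each attribute has a small finite domain.
DOMAINS = {
    "web_servers": [1, 2, 4, 8],
    "server_type": ["small", "large"],
    "load_balancer": [False, True],
}

# Assumed figures for illustration only (queries per minute, relative cost).
CAPACITY = {"small": 50_000, "large": 100_000}
COST = {"small": 1, "large": 3}


def vendor_constraint(spec):
    # Assumed vendor rule: more than one web server needs a load balancer.
    return spec["web_servers"] == 1 or spec["load_balancer"]


def operator_policy(spec):
    # Assumed operator rule: at most 4 large servers per environment.
    return not (spec["server_type"] == "large" and spec["web_servers"] > 4)


def user_requirement(spec, peak_qpm=300_000):
    # User requirement: total capacity must cover the peak load.
    return spec["web_servers"] * CAPACITY[spec["server_type"]] >= peak_qpm


POLICIES = [vendor_constraint, operator_policy, user_requirement]


def compose(domains, policies):
    """Return every specification that satisfies all policies."""
    names = list(domains)
    feasible = []
    for values in product(*(domains[n] for n in names)):
        spec = dict(zip(names, values))
        if all(policy(spec) for policy in policies):
            feasible.append(spec)
    return feasible


def cost(spec):
    servers = spec["web_servers"] * COST[spec["server_type"]]
    return servers + (1 if spec["load_balancer"] else 0)


solutions = compose(DOMAINS, POLICIES)
cheapest = min(solutions, key=cost)
print(len(solutions), cheapest)
```

Enumeration only works for toy domains; the point of the sketch is that each party contributes an independent predicate and the composition step never needs to know who wrote which rule.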
Results
Grid-based resource scheduling
• Assuming
  – Resource pools will be distributed across administrative domains, and
  – Grid mechanisms will be used to access such resource pools,
• Identify
  – What protocols and APIs will be used to discover, reserve, and access resources, and
• Lead
  – The adoption of such protocols in the Global Grid Forum (GGF), and
• Ensure
  – HP divisions are positioned to take advantage of any standards that emerge from the GGF.
Results
Jim Pruyne and Vijay Machiraju, Quartermaster: Grid services for data center resource reservation, Workshop on Designing and Building Grid Services (held in conjunction with GGF 9), Chicago, Illinois, October 8, 2003.
Jerry Rolia, Jim Pruyne, Xiaoyun Zhu, Martin Arlitt, Grids for Enterprise Applications, JSSPP 2003
Graupner, S., Kobiyama, J.: The Road to the Enterprise Grid: Enabling Technology for Managing Data Center Resources Using HP Utility Data Center, Industrial Track Presentation at CCGrid 2003, Tokyo, Japan, May 12-15, 2003
Results
Grid-based resource scheduling
• The Problem: Future resource utilization plans are difficult to maintain and align with changing business priorities
• Value Proposition: A grid-based scheduling system that arbitrates among applications and adapts to changes in demand and capacity
  – Aligns resource allocation with business priorities
  – Helps operators make better decisions
• Core Innovations
  – Scheduling directed by policies
  – Specification for complex resource utilization patterns
  – Users and operators able to run “what-if” scenarios
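A rough sketch of policy-directed scheduling with a "what-if" run might look like the following. It assumes a single policy (admit reservations in order of business priority) and a single pooled resource type; the `Request` fields, the request names, and the capacities are all hypothetical and are not taken from the Quartermaster scheduler itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    name: str
    priority: int  # higher value = more important to the business (assumed policy)
    start: int     # first time slot, inclusive
    end: int       # last time slot, exclusive
    servers: int   # servers needed in every slot of the reservation


def schedule(requests, capacity, horizon):
    """Admit requests in priority order while capacity remains in every slot."""
    free = [capacity] * horizon
    admitted, rejected = [], []
    for req in sorted(requests, key=lambda r: -r.priority):
        slots = range(req.start, req.end)
        if all(free[t] >= req.servers for t in slots):
            for t in slots:
                free[t] -= req.servers
            admitted.append(req.name)
        else:
            rejected.append(req.name)
    return admitted, rejected


requests = [
    Request("batch-report", priority=1, start=0, end=4, servers=6),
    Request("sales-promo", priority=9, start=2, end=6, servers=8),
    Request("nightly-backup", priority=3, start=4, end=8, servers=4),
]

# Current plan with a 10-server pool.
baseline = schedule(requests, capacity=10, horizon=8)
# "What-if" scenario: the same requests against a 14-server pool.
what_if = schedule(requests, capacity=14, horizon=8)
print(baseline)
print(what_if)
```

Because `schedule` is a pure function of the requests and the pool size, a what-if scenario is just a second call with different inputs; the live plan is never modified.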
Results
Resource Allocation
• Given
  – Specifications of time-varying demands coming from multiple applications, and
  – SLA requirements of these applications for various classes of service in terms of resource availability,
• Define
  – Methods to characterize and describe such demands, and
• Obtain
  – Bounds on server pool capacity,
  – Utilization improvements that can be obtained by statistical guarantees,
  – Quantitative probabilities of SLA violation, and
  – Admission control and policing mechanisms
J. Rolia, X. Zhu, M. Arlitt and A. Andrzejak, “Statistical Service Assurance for Applications in Utility Grid Environments,” MASCOTS 2002
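To make the notions of a capacity bound and a probability of SLA violation concrete, here is a small sketch under simplifying assumptions: demand is given as per-slot server counts, and the violation probability is estimated empirically as the fraction of slots in which aggregate demand exceeds the pool. The traces are invented for illustration, and this empirical estimate is a stand-in, not the statistical method of the cited paper.

```python
def aggregate(traces):
    """Sum per-application demand slot by slot (traces have equal length)."""
    return [sum(slot) for slot in zip(*traces)]


def violation_probability(total, capacity):
    """Fraction of time slots in which aggregate demand exceeds capacity."""
    return sum(1 for demand in total if demand > capacity) / len(total)


def min_capacity(total, max_violation):
    """Smallest pool size whose empirical violation fraction is acceptable."""
    for capacity in range(max(total) + 1):
        if violation_probability(total, capacity) <= max_violation:
            return capacity
    return max(total)


# Hypothetical demand traces: servers needed by each application per slot.
app_a = [2, 2, 8, 8, 2, 2, 2, 2, 2, 2]
app_b = [3, 3, 3, 9, 3, 3, 3, 3, 3, 3]
app_c = [1, 1, 1, 1, 1, 1, 1, 1, 1, 6]
traces = [app_a, app_b, app_c]

total = aggregate(traces)
dedicated = sum(max(t) for t in traces)   # every application sized for its own peak
shared_strict = min_capacity(total, 0.0)  # shared pool, no violations tolerated
shared_relaxed = min_capacity(total, 0.1) # shared pool, up to 10% of slots may violate
print(dedicated, shared_strict, shared_relaxed)
```

The gap between the three numbers is the utilization improvement on offer: sharing the pool removes the need to provision for peaks that do not coincide, and accepting a small violation probability lowers the bound further.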
Results
Resource assignment
• Given
  – a physical network topology and link capacities,
  – a set of servers with varying capabilities,
  – an application architecture, and
  – application requirements for communication, processing, and storage.
• Find
  – the “best” assignment of servers to application components, which
    • optimizes a performance- or policy-based objective,
    • satisfies server requirements, and
    • avoids network congestion.
Key learning:
We can take (multiple) applications with arbitrary topology, computing, and communication requirements, and optimally place such applications on arbitrary LAN fabric connecting heterogeneous compute resources, using either local storage or storage arrays connected via a core-edge SAN fabric.
X. Zhu et al., Resource assignment for large scale computing utilities using mathematical programming, submitted to NSDI ’04.
Results
Resource assignment
• Value proposition
  – Automate dynamic resource assignment
    • Quickly solves complex assignment problems that are beyond the scope of a human operator
    • Creates optimal assignments to avoid bottlenecks
    • Responds to changes in user requirements in “real-time”
• Core innovation shown
  – Solved a very complex combinatorial non-linear NP-hard problem by transforming it into a linear one that can be handled efficiently by commercial solvers
  – Advantage of approach: Clearly indicates if a feasible optimal solution exists and if not, why not.
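The shape of the assignment problem can be shown on a toy instance. The sketch below does not use the linear transformation or a commercial solver; it substitutes brute-force search over all placements, which is only viable at this size. The servers, racks, CPU figures, traffic rates, and the single inter-rack link capacity are all hypothetical.

```python
from itertools import permutations

# Hypothetical toy instance.
servers = {  # name: (cpu capacity, rack)
    "s1": (4, "rackA"),
    "s2": (8, "rackA"),
    "s3": (8, "rackB"),
    "s4": (2, "rackB"),
}
components = {"web": 2, "app": 4, "db": 8}          # name: cpu required
traffic = {("web", "app"): 10, ("app", "db"): 30}   # rate between components
INTER_RACK_CAPACITY = 25                             # assumed shared uplink limit


def inter_rack_traffic(assignment):
    """Total traffic between components placed in different racks."""
    return sum(
        rate
        for (a, b), rate in traffic.items()
        if servers[assignment[a]][1] != servers[assignment[b]][1]
    )


def feasible(assignment):
    """Each component fits its server and the uplink is not congested."""
    fits = all(components[c] <= servers[s][0] for c, s in assignment.items())
    return fits and inter_rack_traffic(assignment) <= INTER_RACK_CAPACITY


def best_assignment():
    """Try every one-component-per-server placement; minimize inter-rack traffic."""
    names = list(components)
    best, best_cost = None, None
    for chosen in permutations(servers, len(names)):
        assignment = dict(zip(names, chosen))
        if not feasible(assignment):
            continue
        cost = inter_rack_traffic(assignment)
        if best_cost is None or cost < best_cost:
            best, best_cost = assignment, cost
    return best, best_cost  # (None, None) means no feasible placement exists


assignment, cost = best_assignment()
print(assignment, cost)
```

In this instance the database fits only on the two large servers and its link to the application tier exceeds the uplink, so both must share a rack; the search therefore places them in rackA and pushes the web tier across the uplink.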
Results
Resource modeling
• Given
  – A resource pool with a variety of resources from various vendors,
  – New types of resources that were not anticipated when the system was built,
• Create
  – Models that easily accommodate changes in the resource pool,
  – Generic management functionality that works with such heterogeneous resource pools,
• To support
  – Plug-n-play instantiation of
    • both resource types and instances
    • both grounded and constructed resources
[Figure: resource meta-model with elements resource type (name, resource type qualifiers), association type (name, association type qualifiers), property (name, value type, default value, property qualifiers), method (name, in/out parameters, return parameter, method qualifiers), parameter (name, value type, default value, parameter qualifiers), reference (reference qualifiers), and policy, linked by super type and association type relationships]
Key learning:
We believe resources can be modeled using existing CIM standards, and our management system will abstract and leverage these models. Such models will provide the abstractions necessary to manage heterogeneous resources.
Results
Model-based resource integration
• The problem
  – Takes a few months to integrate a new resource type into a utility environment.
  – A simple change in an existing resource type takes several weeks.
  – Management software needs to be re-written and then re-installed in the customer’s environment.
• Value proposition
  – Rapid resource integration into the utility.
    • A few months reduced to a few minutes.
    • No management software refresh is needed at the customer’s site.
• Core innovation
  – We have architected our resource utility in a model-driven manner.
  – Models describe what resources and their capabilities are, and the management software does not have to change when models change.
  – Models are based on industry standards in management.
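The model-driven idea, where management code stays fixed while resource types arrive as data, can be sketched as follows. The dictionary layout is a simplified, hypothetical stand-in loosely inspired by the meta-model (a type with a super type and properties with default values); it is not CIM and not the Quartermaster model format, and all type and property names are invented.

```python
# Hypothetical type models: plain data, as if loaded from model files.
MODELS = {
    "Resource": {"super": None, "properties": {"vendor": None, "state": "free"}},
    "Server": {"super": "Resource", "properties": {"cpus": 1, "memory_gb": 1}},
}


def resolved_properties(models, type_name):
    """Collect property defaults along the super-type chain, most general first."""
    chain = []
    while type_name is not None:
        chain.append(models[type_name])
        type_name = models[type_name]["super"]
    props = {}
    for model in reversed(chain):
        props.update(model["properties"])  # sub-types override inherited defaults
    return props


def instantiate(models, type_name, **overrides):
    """Generic management code: works for any type present in the models."""
    props = resolved_properties(models, type_name)
    unknown = set(overrides) - set(props)
    if unknown:
        raise ValueError(f"unknown properties for {type_name}: {sorted(unknown)}")
    props.update(overrides)
    return {"type": type_name, **props}


# Integrating a new resource type is a data change; no function above is edited.
MODELS["BladeServer"] = {
    "super": "Server",
    "properties": {"chassis_slot": 0, "cpus": 2},
}

blade = instantiate(MODELS, "BladeServer", vendor="HP", chassis_slot=3)
print(blade)
```

Adding `BladeServer` touches only the model table, which is the property the slide attributes to the model-driven design: the generic `instantiate` routine does not change when models change.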
Summary
• We have solved some very challenging problems dealing with resource management
  – Application and infrastructure agnostic
  – Provably “hard” (NP) in the general case
• We have taken a leadership position in a targeted area (GRAAP) in GGF that deals with resource allocation/reservations.
• We have provided all of the messaging HP has done around the UDC/Grid for 2003.
• We have defined the “control” specifications/APIs that are now being shared with partners by the UDC team.
• We have an architecture and components that we are currently integrating to create a test-bed/demonstrator.
Summary & Futures
An integrated testbed for policy-driven computing services
• Our challenge is to build a resource utility that provides resources to applications
  – Summary
    • Quartermaster is addressing challenges in resource modeling, resource composition, and resource allocation.
  – Future:
    • Develop prototype of an automated, adaptive utility
      – Automation: Automate parts of the design, deploy, manage lifecycle for application environments
      – Intelligence: Statistical techniques to determine if SLAs are met
      – Framework: Instrumentation to feed statistical techniques and mechanism to work with dynamic resource assignment
    • Leverage research in other groups/universities for
      – Resource virtualization
      – Grid standards and protocols
      – Management models and policy specifications
      – Service deployment and lifecycle management