Exploring Cloud for Data Warehousing
DESCRIPTION
Description of the basic cloud principles, the cost & deployment model for cloud, shortcomings for BI workloads beyond modest scale, some stats on market adoption/preference of cloud for DW.
TRANSCRIPT
Cloud Computing
"…a model for enabling ubiquitous, convenient, on‐demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction."
http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
What people see: seemingly infinite resource to apply to performance problems on short notice and at low cost
Generators: Expensive Product
Generators: Commodity Product
Generators as a Service: Electricity
The Natural Process of Commoditization
Simon Wardley, A Lifecycle Approach to Cloud Computing
Managing Hardware Resources
Systems are sized for the peak workload, with the expectation that it will fluctuate.
[Chart: capacity vs. demand; axes: resources over time]
Idle resources = low utilization = money wasted
[Chart: capacity vs. demand, with idle resources marked; axes: resources over time]
Not enough resource is (much) worse than too much.
[Chart: capacity vs. demand; axes: resources over time]
Maintaining capacity just above the peak as workloads increase is the art of capacity planning.
One problem is the large step when upgrading to more resources, equating to a large capital cost.
[Chart: capacity vs. demand; axes: resources over time]
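The step problem above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the demand figures, the starting capacity and the step size are hypothetical, and the function names are invented for this example. It buys a fixed-size upgrade whenever demand would exceed capacity, then reports how much of the purchased capacity sits idle.

```python
def plan_step_upgrades(demand, initial_capacity, step):
    """Return the capacity in each period when capacity only grows in fixed steps.

    An upgrade is bought whenever the current capacity falls below demand.
    """
    capacity = initial_capacity
    plan = []
    for needed in demand:
        while capacity < needed:
            capacity += step
        plan.append(capacity)
    return plan


def idle_fraction(demand, capacity_plan):
    """Fraction of purchased capacity that sits unused across all periods."""
    total_capacity = sum(capacity_plan)
    total_used = sum(min(d, c) for d, c in zip(demand, capacity_plan))
    return (total_capacity - total_used) / total_capacity


# Hypothetical peak demand per quarter, in arbitrary resource units.
peak_demand = [40, 55, 70, 90, 110, 135, 160, 190]
plan = plan_step_upgrades(peak_demand, initial_capacity=100, step=100)
print(plan)
print(round(idle_fraction(peak_demand, plan), 2))
```

With these made-up numbers a single upgrade doubles capacity in the fifth quarter, and a large share of what was paid for is never used, even though demand is measured at its peak.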
Great performance after an upgrade, bad performance at year‐end before the next upgrade.
A steady decline can be worse for user perception than constant mediocre performance.
[Chart: capacity vs. demand, with idle resources marked; axes: resources over time]
What everyone would like: elastic capacity
Pay for the resources you use when you use them, not up front for the entire system that supplies them. Just like electricity.
[Chart: capacity vs. demand; axes: resources over time]
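A rough way to compare the two payment models is to price the same demand curve both ways. The sketch below is illustrative only: the hourly demand, the unit cost and the 1.5x on-demand price premium are assumptions chosen for the example, not figures from the presentation.

```python
def fixed_cost(demand, unit_cost):
    """Cost of owning enough capacity for the peak in every period."""
    return max(demand) * unit_cost * len(demand)


def elastic_cost(demand, unit_cost, premium=1.0):
    """Cost of paying only for the units used in each period.

    `premium` is a multiplier for the higher unit price an on-demand
    service may charge relative to owned hardware (an assumption here).
    """
    return sum(demand) * unit_cost * premium


# Hypothetical hourly demand for one day, in arbitrary resource units.
hourly_demand = [10] * 8 + [60] * 8 + [100] * 2 + [30] * 6
owned = fixed_cost(hourly_demand, unit_cost=1.0)
on_demand = elastic_cost(hourly_demand, unit_cost=1.0, premium=1.5)
print(owned, on_demand)
```

The comparison depends entirely on how spiky the workload is: the flatter the demand curve, the smaller the premium an elastic service can charge before owning the hardware becomes cheaper.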
Five Key Cloud Characteristics
1. On‐demand self‐service
2. Network accessibility
3. Resource pooling
4. Measured service
5. Elasticity
Cloud Architecture
Started with virtual machines
[Diagram: virtual machines on servers, each with memory, CPU and disk]
Lots of servers, lots of virtual nodes. But in public clouds:
• Storage can be, and often is, separated
• VMs don’t run across nodes
• Great for OLTP, not so much for BI
• Implies new software architectures
[Diagram: many nodes, each with memory, CPU and disk, alongside groups of memory and CPUs attached to shared disk]
Database Architecture and the Cloud
Virtualizing on a single server makes no sense for a database that needs the full resources.
If your server hardware environment looks like this:
[Diagram: many nodes, each with memory, CPU and disk, alongside groups of memory and CPUs attached to shared disk]
then it’s probably good for lightweight transaction processing, simple storage and retrieval, procedural computations on data.
If you want to use it for a data warehouse, you need:
• A shared‐nothing database
• A proper storage architecture
• Dynamic licensing
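To make "shared‐nothing" concrete, here is a minimal sketch in Python. It assumes a toy table of (customer_id, amount) rows and uses a simple modulo on the key as a stand-in for the hash distribution a real shared-nothing database would use; the function names are hypothetical. Each node aggregates only the rows it owns, and a final step merges the partial results.

```python
from collections import Counter, defaultdict


def partition_rows(rows, node_count):
    """Assign each (customer_id, amount) row to exactly one node by key."""
    nodes = defaultdict(list)
    for customer_id, amount in rows:
        nodes[customer_id % node_count].append((customer_id, amount))
    return nodes


def local_totals(node_rows):
    """Aggregate on one node, using only the rows that node owns."""
    totals = Counter()
    for customer_id, amount in node_rows:
        totals[customer_id] += amount
    return totals


def total_by_customer(rows, node_count):
    """Run the aggregate on every node, then merge the partial results."""
    merged = Counter()
    for node_rows in partition_rows(rows, node_count).values():
        merged.update(local_totals(node_rows))
    return dict(merged)


# Hypothetical sample rows: (customer_id, amount).
sales = [(1, 10), (2, 5), (1, 7), (3, 2), (4, 1)]
print(total_by_customer(sales, node_count=2))
```

Because no node reads another node's rows, adding nodes adds memory, CPU and disk together, which is why this style of database fits the many-small-nodes hardware found in clouds.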
Three Models of Deployment
1. Public cloud
2. Leased / hosted private cloud
3. Private cloud
Benefits and Rationale
Why did you / are you considering a move to the cloud?
Two primary reasons:
▪ Cost reduction
▪ Reduced time to value
47%
50%
Hardware savings
Pay only for what we use
42%
44%
46%
Lower outside maintenance costs
Lower labor costs
Software license savings
Cost reduction
40%
40%
42%
Able to take advantage of latest functionality
Reduce IT support needs
Lower outside maintenance costs
39%
39%
39%
Able to scale IT resources to meet needs
Relieve pressure on internal resources
Rapid deployment
Reduce time to value
39%
Resolve problems related to updating/upgrading
IBM global survey of IT and line-of-business decision makers
Unexpected Benefits
Speed to deploy:
▪ Opex vs. capex means faster approvals and less planning
▪ Provisioning on demand means the ability to do all those small projects that needed resources and staff to set up
Performance management:
▪ Resource‐oriented fixes done in minutes
▪ Instead of static resources and fluctuations in performance, set static SLAs and fluctuate the resources
Administration:
▪ No more hardware or operating system upgrades to deal with
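The idea of holding the SLA fixed and letting the resources fluctuate can be sketched as a simple sizing rule. The function below is hypothetical and rests on a strong assumption, stated in the code: that response time shrinks in proportion to the number of nodes. Real workloads rarely scale that cleanly, so treat it as an illustration of the control loop, not a sizing formula.

```python
import math


def nodes_for_sla(current_nodes, observed_seconds, sla_seconds, max_nodes=64):
    """Suggest a node count that brings response time back to the SLA.

    Assumes response time scales inversely with node count, which is an
    idealization; real workloads rarely scale perfectly linearly.
    """
    needed = math.ceil(current_nodes * observed_seconds / sla_seconds)
    return max(1, min(needed, max_nodes))


# Hypothetical readings: queries take 30 s on 4 nodes against a 10 s SLA.
print(nodes_for_sla(current_nodes=4, observed_seconds=30, sla_seconds=10))
# Over-provisioned case: 5 s on 8 nodes against the same 10 s SLA.
print(nodes_for_sla(current_nodes=8, observed_seconds=5, sla_seconds=10))
```

The same rule scales down as well as up, which is the point of the slide: the target stays constant and the bill follows the workload.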
Public Cloud Challenges
1. Multi‐tenant servers and unpredictable I/O performance
2. Legal problems:
▪ Data co‐mingling in multi‐tenant databases
▪ Data locality and national laws
3. Cloud compatibility for data integration and data management tools (environment, data movement)
4. Security requirements
When these are a concern, private clouds may be the better option today.
What are manager preferences?
[Chart: Data warehouses or data marts; values: 21%, 44%, 35%]
Prefer not to use cloud
Private cloud preference
Public cloud preference
[Chart: Data mining, text mining, or other analytics; values: 9%, 52%, 39%]
IBM global survey of IT and line-of-business decision makers
Comparison of Models
New and growing use cases drive the need to expand
The use cases are now interactive applications, lower latency data, complex analytics and rapidly growing data volumes.
Image Attributions
Thanks to the people who supplied the images used in this presentation:
Commoditization diagram ‐ from A Lifecycle Approach to Cloud Computing, © Simon Wardley
tesla coil train ‐ http://www.flickr.com/photos/winterhalter/27364687
Amazon Virtual Private Cloud diagram ‐ © Amazon, Inc.
caged_tower_melbourne.jpg ‐ http://www.flickr.com/photos/vermininc/2227512763
About the Presenter
Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, analytics and information management. Mark is an award-winning author, architect and former CTO whose work has been featured in numerous industry publications. During his career Mark received awards from the American Productivity & Quality Center, TDWI, Computerworld and the Smithsonian Institute. He is an international speaker, contributing editor at Intelligent Enterprise, and manages the open source channel at the Business Intelligence Network. For more information or to contact Mark, visit http://ThirdNature.net.
About Third Nature
Third Nature is a research and consulting firm focused on new and emerging technology and practices in business intelligence, analytics and performance management. If your question is related to BI, analytics, information strategy and data then you're at the right place.
Our goal is to help companies take advantage of information-driven management practices and applications. We offer education, consulting and research services to support business and IT organizations as well as technology vendors.
We fill the gap between what the industry analyst firms cover and what IT needs. We specialize in product and technology analysis, so we look at emerging technologies and markets, evaluating technology and how it is applied rather than vendor market positions.