exploring cloud for data warehousing

Exploring Cloud Computing Options for Data Warehousing July 26, 2012 Mark Madsen @markmadsen www.ThirdNature.net

Upload: mark-madsen

Post on 27-Jan-2015


DESCRIPTION

Description of the basic cloud principles, the cost & deployment model for cloud, shortcomings for BI workloads beyond modest scale, some stats on market adoption/preference of cloud for DW.

TRANSCRIPT

Page 1: Exploring cloud for data warehousing

Exploring Cloud Computing Options for Data Warehousing

July 26, 2012

Mark Madsen @markmadsen www.ThirdNature.net

Page 2: Exploring cloud for data warehousing

Cloud Computing

"…a model for enabling ubiquitous, convenient, on‐demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction."

http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf

What people see: seemingly infinite resource to apply to performance problems on short notice and at low cost

Page 3: Exploring cloud for data warehousing

Generators: Expensive Product

Page 4: Exploring cloud for data warehousing

Generators: Commodity Product

Page 5: Exploring cloud for data warehousing

Generators as a Service: Electricity

Page 6: Exploring cloud for data warehousing

The Natural Process of Commoditization

Simon Wardley, A Lifecycle Approach to Cloud Computing

Page 7: Exploring cloud for data warehousing

Managing Hardware Resources

Systems are sized for the peak workload, with the expectation that it will fluctuate.

Chart: capacity and demand (resources over time)

Page 8: Exploring cloud for data warehousing

Idle resources = low utilization = money wasted

Chart: capacity, demand and idle resources (resources over time)
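The arithmetic behind this slide can be sketched in a few lines. This is only an illustration: the monthly demand figures are invented, and the `utilization` helper is a hypothetical name, not something from the presentation.

```python
def utilization(demand, capacity):
    """Average fraction of a fixed capacity that is actually used."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    # Demand above capacity cannot be served, so it is capped.
    used = sum(min(d, capacity) for d in demand)
    return used / (capacity * len(demand))


# Hypothetical monthly demand, in arbitrary resource units.
monthly_demand = [40, 45, 50, 55, 60, 100, 60, 55, 50, 45, 40, 100]

capacity = max(monthly_demand)            # system sized for the peak
util = utilization(monthly_demand, capacity)
idle_fraction = 1 - util                  # share of paid-for capacity left idle

print(f"utilization: {util:.0%}, idle: {idle_fraction:.0%}")
```

With these made-up numbers, two peak months force a capacity that sits more than 40% idle over the year.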

Page 9: Exploring cloud for data warehousing

Not enough resource is (much) worse than too much.

Chart: demand exceeding capacity (resources over time)

Page 10: Exploring cloud for data warehousing

Maintaining capacity just above the peak as workloads increase is the art of capacity planning.

One problem is the large step when upgrading to more resources, equating to a large capital cost.

Chart: capacity rising in steps above growing demand (resources over time)
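A rough sketch of the step-upgrade pattern described on this slide, assuming capacity can only be bought in fixed-size increments. The demand series, the step size and the `upgrade_schedule` function are all hypothetical.

```python
def upgrade_schedule(demand, initial_capacity, step):
    """Grow capacity in fixed steps whenever demand would exceed it.

    Returns the capacity in each period and the periods in which an
    upgrade had to be purchased.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    capacity = initial_capacity
    capacities, upgrade_periods = [], []
    for period, needed in enumerate(demand):
        while capacity < needed:          # may take more than one step
            capacity += step
            upgrade_periods.append(period)
        capacities.append(capacity)
    return capacities, upgrade_periods


# Hypothetical steadily growing demand, in arbitrary resource units.
demand = [50, 60, 70, 80, 90, 100, 110, 120, 130, 140]
capacities, upgrade_periods = upgrade_schedule(demand, initial_capacity=64, step=64)

# Unused capacity per period: large right after an upgrade, shrinking until the next.
headroom = [c - d for c, d in zip(capacities, demand)]
print(upgrade_periods, headroom)
```

Each entry in `upgrade_periods` corresponds to one capital purchase, and `headroom` shows how much of that purchase sits unused until demand catches up.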

Page 11: Exploring cloud for data warehousing

Great performance after an upgrade, bad performance at year‐end before the next upgrade.

A steady decline can be worse for user perception than constant mediocre performance.

Chart: capacity, demand and idle resources (resources over time)

Page 12: Exploring cloud for data warehousing

What everyone would like: elastic capacity

Pay for the resources you use when you use them, not up front for the entire system that supplies them. Just like electricity.

Chart: capacity tracking demand (resources over time)
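A back-of-the-envelope comparison of the two payment models, as a sketch only. The demand figures, the unit price and the `premium` charged for on-demand units are assumptions made for illustration and do not come from the presentation.

```python
def fixed_cost(demand, unit_price):
    """Pay for peak-sized capacity in every period, used or not."""
    return max(demand) * unit_price * len(demand)


def elastic_cost(demand, unit_price, premium=1.0):
    """Pay only for what is used; `premium` is the assumed markup per unit."""
    return sum(demand) * unit_price * premium


# Hypothetical monthly demand, in arbitrary resource units.
monthly_demand = [40, 45, 50, 55, 60, 100, 60, 55, 50, 45, 40, 100]

fixed = fixed_cost(monthly_demand, unit_price=1.0)
elastic = elastic_cost(monthly_demand, unit_price=1.0, premium=1.5)

# Elastic pricing stays cheaper as long as the premium is below peak / average demand.
break_even_premium = fixed / sum(monthly_demand)

print(fixed, elastic, round(break_even_premium, 2))
```

Under these assumptions the spikier the workload, the larger the premium an elastic model can carry and still cost less than owning peak capacity.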

Page 13: Exploring cloud for data warehousing

Five Key Cloud Characteristics

1. On‐demand self‐service

2. Network accessibility

3. Resource pooling

4. Measured service

5. Elasticity

Page 14: Exploring cloud for data warehousing

Cloud Architecture

Started with virtual machines

Lots of servers, lots of virtual nodes. But in public clouds:

• Storage can be, and often is, separated

• VMs don’t run across nodes

• Great for OLTP, not so much for BI

• Implies new software architectures

Diagram: many virtual nodes, each labeled Mem / CPU / Disk, alongside servers labeled CPUs / Memory / Shared disk

Page 15: Exploring cloud for data warehousing

Database Architecture and the Cloud

Virtualizing on a single server makes no sense for a database that needs the full resources.

If your server hardware environment looks like this:

Diagram: many virtual nodes, each labeled Mem / CPU / Disk, on servers labeled CPUs / Memory / Shared disk

then it’s probably good for lightweight transaction processing, simple storage and retrieval, procedural computations on data.

If you want to use it for a data warehouse, you need:

• A shared‐nothing database

• A proper storage architecture

• Dynamic licensing
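To make the term "shared‐nothing" concrete, here is a toy sketch of the idea: rows are hash-partitioned across independent nodes, each node aggregates only its own rows, and a coordinator merges the partial results. All names and data are hypothetical, and a real shared‐nothing database does far more than this.

```python
import zlib
from collections import defaultdict


def node_for(key, num_nodes):
    """Pick a node by hashing the key (crc32 is stable across runs)."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def partition(rows, num_nodes):
    """Distribute (key, amount) rows so each node owns a disjoint slice."""
    nodes = [[] for _ in range(num_nodes)]
    for key, amount in rows:
        nodes[node_for(key, num_nodes)].append((key, amount))
    return nodes


def local_sum(rows):
    """Work done on one node, touching only that node's own data."""
    totals = defaultdict(int)
    for key, amount in rows:
        totals[key] += amount
    return dict(totals)


def merge(partials):
    """Coordinator step: combine the per-node partial results."""
    result = {}
    for partial in partials:
        for key, total in partial.items():
            result[key] = result.get(key, 0) + total
    return result


# Hypothetical fact rows: (region, sales amount).
rows = [("east", 10), ("west", 5), ("east", 7), ("north", 3), ("west", 1)]
nodes = partition(rows, num_nodes=3)
totals = merge(local_sum(node_rows) for node_rows in nodes)
print(totals)
```

Because no node reads another node's memory or disk, adding nodes adds both storage and scan capacity, which is the property a data warehouse workload needs from this kind of environment.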

Page 16: Exploring cloud for data warehousing

Three Models of Deployment

1. Public cloud

2. Leased / hosted private cloud

3. Private cloud

Page 17: Exploring cloud for data warehousing

Benefits and Rationale

Why did you / are you considering a move to the cloud?

Two primary reasons:
▪ Cost reduction
▪ Reduced time to value

Chart: share of respondents citing each reason, grouped under "Cost reduction" and "Reduce time to value"

▪ Pay only for what we use: 50%
▪ Hardware savings: 47%
▪ Software license savings: 46%
▪ Lower labor costs: 44%
▪ Lower outside maintenance costs: 42%
▪ Able to take advantage of latest functionality: 40%
▪ Reduce IT support needs: 40%
▪ Able to scale IT resources to meet needs: 39%
▪ Relieve pressure on internal resources: 39%
▪ Rapid deployment: 39%
▪ Resolve problems related to updating/upgrading: 39%

IBM global survey of IT and line-of-business decision makers

Page 18: Exploring cloud for data warehousing

Unexpected Benefits

Speed to deploy:
▪ opex vs capex means faster approvals and less planning
▪ Provision on‐demand means ability to do all those small projects that needed resources and staff to set up

Performance management:
▪ Resource‐oriented fixes done in minutes
▪ Instead of static resources and fluctuations in performance, set static SLAs and fluctuate the resources

Administration:
▪ No more hardware or operating system upgrades to deal with
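The "static SLAs, fluctuating resources" idea can be sketched as a simple sizing rule. This is a deliberately naive illustration: it assumes each node sustains a fixed query rate within the SLA and that throughput scales linearly with node count, which often does not hold for BI workloads. The load figures and the `nodes_needed` function are hypothetical.

```python
import math


def nodes_needed(queries_per_minute, per_node_capacity, min_nodes=1):
    """Smallest node count that keeps the load within the per-node SLA rate."""
    if per_node_capacity <= 0:
        raise ValueError("per_node_capacity must be positive")
    return max(min_nodes, math.ceil(queries_per_minute / per_node_capacity))


# Hypothetical load per hour, in queries per minute.
hourly_load = [30, 45, 120, 200, 90, 20]

# Assumed: one node serves 50 queries per minute within the SLA.
plan = [nodes_needed(load, per_node_capacity=50) for load in hourly_load]
print(plan)
```

The SLA stays fixed while the node count follows the load, instead of a fixed node count producing fluctuating response times.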

Page 19: Exploring cloud for data warehousing

Public Cloud Challenges

1. Multi‐tenant servers and unpredictable I/O performance

2. Legal problems:
▪ Data co‐mingling in multi‐tenant databases
▪ Data locality and national laws

3. Cloud compatibility for data integration and data management tools (environment, data movement)

4. Security requirements

When these are a concern, private clouds may be the better option today.

Page 20: Exploring cloud for data warehousing

What are manager preferences?

Chart: manager preferences by workload (legend: prefer not to use cloud, private cloud preference, public cloud preference)

Data warehouses or data marts: 21%, 44%, 35%

Data mining, text mining, or other analytics: 9%, 52%, 39%

IBM global survey of IT and line-of-business decision makers

Page 21: Exploring cloud for data warehousing

Comparison of Models

Page 22: Exploring cloud for data warehousing

New and growing use cases drive the need to expand

The use cases are now interactive applications, lower latency data, complex analytics and rapidly growing data volumes.

Page 23: Exploring cloud for data warehousing

Image Attributions

Thanks to the people who supplied the images used in this presentation:

Commoditization diagram - from A Lifecycle Approach to Cloud Computing, © Simon Wardley

tesla coil train - http://www.flickr.com/photos/winterhalter/27364687

Amazon Virtual Private Cloud diagram - © Amazon, Inc.

caged_tower_melbourne.jpg - http://www.flickr.com/photos/vermininc/2227512763

Page 24: Exploring cloud for data warehousing

About the Presenter

Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, analytics and information management. Mark is an award-winning author, architect and former CTO whose work has been featured in numerous industry publications. During his career Mark received awards from the American Productivity & Quality Center, TDWI, Computerworld and the Smithsonian Institute. He is an international speaker, contributing editor at Intelligent Enterprise, and manages the open source channel at the Business Intelligence Network. For more information or to contact Mark, visit http://ThirdNature.net.

Page 25: Exploring cloud for data warehousing

About Third Nature

Third Nature is a research and consulting firm focused on new and emerging technology and practices in business intelligence, analytics and performance management. If your question is related to BI, analytics, information strategy and data then you’re at the right place.

Our goal is to help companies take advantage of information-driven management practices and applications. We offer education, consulting and research services to support business and IT organizations as well as technology vendors.

We fill the gap between what the industry analyst firms cover and what IT needs. We specialize in product and technology analysis, so we look at emerging technologies and markets, evaluating technology and how it is applied rather than vendor market positions.