all up datawarewhouse – from smp to parallel

38
All up datawarewhouse – From SMP to Parallel Data warehousing

Upload: sriram-jayaraman

Post on 28-May-2015

303 views

Category:

Technology


1 download

TRANSCRIPT

Page 1: All up datawarewhouse – from smp to parallel

All up datawarewhouse – From SMP to Parallel Data warehousing

Page 2: All up datawarewhouse – from smp to parallel
Page 3: All up datawarewhouse – from smp to parallel

UNSTRUCTURED

UNBALANCED

UNPREDICTABLE

Page 4: All up datawarewhouse – from smp to parallel

Take 1 big SANAdd a little ServerAdd a bigger ServerAdd more networking

Page 5: All up datawarewhouse – from smp to parallel

POTENTIAL PERFORMANCE BOTTLENECKS

FCHBA

AB

FCHBA

AB

FC S

WIT

CH

STORAGECONTROLLER

AB

ABCA

CHE

SERV

ER

CACH

ESQ

L SE

RVER

WIN

DO

WS

CPU

CO

RES

CPU Feed Rate HBA Port Rate Switch Port Rate SP Port Rate

A

BDISK DISK

LUN

DISK DISK

LUN

SQL Server Read Ahead Rate

LUN Read Rate Disk Feed Rate

Page 6: All up datawarewhouse – from smp to parallel

It’s all about …. SIZING

Page 7: All up datawarewhouse – from smp to parallel

One SHOE does not FIT ALL

Page 8: All up datawarewhouse – from smp to parallel

Transaction processing

simplifies and accelerates data

capture for accurate business

decisions

Data warehousing

enables common data model for

single version of the truth

Analysis leads to optimized business processes and improved performance

Page 9: All up datawarewhouse – from smp to parallel

Data Warehouse Scope

Dat

a P

ath

Data Warehouse

Analysis Services Cubes

PerformancePoint

Dedicated SAN, Storage Array

Reporting Services

Web Analytic Tools

Integration Services ETL

SharePoint ServicesMicrosoft Office SharePoint

Data Staging,Bulk Loading

Supporting Systems

BI Data Storage Systems

Presentation Layer Systems

Data Warehouse Scope (dashed)

Pre

sen

tati

on

Dat

aP

rese

nta

tio

n D

ata

Page 10: All up datawarewhouse – from smp to parallel

Data Warehouse Scenarios

• No longer exclusive to large enterprises and specialists analysts

• Growth of affordable self-service BI tools such as PowerPivot and Reporting Services has created a DW requirement for smaller businesses and individual departments

Page 11: All up datawarewhouse – from smp to parallel

Microsoft Data Warehousing Offerings

Scalable and reliable SMP

platform for data warehousing on any hardware

Scalable and reliable platform

for data warehousing on any hardware

Reference architectures

offering best price performance for

data warehousing

Appliance for high end MPP Data Warehousing

delivering highest scalability and performance

Ideal for data marts or small to

mid-sized enterprise data

warehouses (EDWs)

Ideal for large data marts or mid-

sized EDWs

Ideal for data marts or small to mid-sized data

warehouses with scan-centric workloads

Ideal for high scale or high performance data

marts and EDWs

Software only

Integrated Appliance

(Software and Hardware)

Reference Architectures(Software and

Hardware)

DW Appliance(Fully integrated

Software and Hardware)

Scale-Up DW Scale-Up DW Scale-Up DW Scale-Out DW with MPP

10s of terabytes <5 terabytes 5–80 terabytes 10s - 100s of TB

Software Assurance; Premier Mission Critical

Support3-Year Support Plus 24

Software Assurance; Premier Mission Critical

SupportMission Critical

Advantage Program

Enterprise Fast Track Data Warehouse RA

BDW Appliance

Parallel Data

Warehouse

Page 12: All up datawarewhouse – from smp to parallel

Microsoft Data Warehouse Offerings

Effort to Build Very High Very Low Moderate

Moderate

Moderate Moderate

Very Low

Capacity Variable 5 TB 14 TB 20 TB 40 TB 40 TB 500 TB

Concurrency Variable Light Light Medium Medium High Very High

Query Complexity

Variable Medium Medium

Medium Medium High Very High

Page 13: All up datawarewhouse – from smp to parallel

Business Data Warehouse Appliance

Page 14: All up datawarewhouse – from smp to parallel

Business Data Warehouse Appliance

Agile• Deploy in hours/days, not in

months

• Easy to use through built-in dedicated tools to load and manage your data warehouse

• Designed for up to 5TB data warehouses

• Fast Track 3.0 compliant, license path to Fast-Track

Complete• Hardware +

Software + Services• Pre-tuned, pre

configured, pre-installed. Turn on and go!

• Single point of contact for support

Optimized• Specifically for small to

medium data warehouse workload

• Designed for performance, energy efficiency, and value by HP and Microsoft’s best engineers

• Security and reliability built in

Page 15: All up datawarewhouse – from smp to parallel

Scenarios

Small/Departmental Data Warehouse

Spoke in EDW Hub and Spoke Architecture

Page 16: All up datawarewhouse – from smp to parallel

Reference Architectures

Page 17: All up datawarewhouse – from smp to parallel

Fast Track Data Warehouse Components

Software:• SQL Server 2008 R2

Enterprise• Windows Server 2008

R2Configuration guidelines:

• Physical table structures

• Indexes• Compression• SQL Server settings• Windows Server

settings• Loading

Hardware:•Tight specifications for servers, storage and networking•‘Per core’ building block

Processing

Networking

Server

Storage

Page 18: All up datawarewhouse – from smp to parallel

SQL Server Parallel Data Warehouse

Page 19: All up datawarewhouse – from smp to parallel

SQL Server Parallel Data Warehouse• Tier-1 Enterprise Data Warehouse Appliance Offering

– High scalability from tens to hundreds of terabytes– High performance through the MPP system

• Flexibility and Choice – Choice of deployment options through distributed architecture

• Most Comprehensive Solution– Complete data warehouse solution spanning desktop,

enterprise data warehouse, and data marts

Page 20: All up datawarewhouse – from smp to parallel
Page 21: All up datawarewhouse – from smp to parallel

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

CONTROL RACK DATA RACK

Page 22: All up datawarewhouse – from smp to parallel

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

CONTROL RACK

Client connections always go through the control node

Contains no persistent user data

Parallel Data Warehouse advantages:

o Processes SQL requests

o Prepares execution plan

o Orchestrates distributed execution

Local SQL Server processes final query plan and aggregates results

Provided by DataDirect

o Open database connectivity (ODBC), object linking and embedding database (OLE DB), Java Database Connectivity (JDBC), and ActiveX® Data Objects (ADO.net) client drivers

o Wire protocol (SeQuel link)

o Drivers are available for 32 bits and 64 bits

CONTROL NODE

Page 23: All up datawarewhouse – from smp to parallel

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

CONTROL RACK

Provides Support and Patching for the Appliance

Holds image for re-deployment of compute node

Holds Active Directory

MANAGEMENT NODE

Page 24: All up datawarewhouse – from smp to parallel

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

CONTROL RACK

Provides high-capacity storage for data files from ETL processes

Is available as a sandbox for other applications and scripts that run on the internal network

Provides SQL Server Integration Services

LANDING ZONE

Source

Landing Zone Files

Data Loade

r

Compute Nodes

DWLoader or

SQL Server Integration

Services

Page 25: All up datawarewhouse – from smp to parallel

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

CONTROL RACK

Provides Integrated Backup Solution

Integrates with 3rd party backup option

Orderable in different sizes

BACKUP NODE

Page 26: All up datawarewhouse – from smp to parallel

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

DATA RACK

• Data Rack Servers 10 active + 1 passive

• HP ProLiant DL360 G7 compute nodes

• InfiniBand, FC and Ethernet switching, 42U rack

• Expansion Grow from 1–4 data racks, storage options, test/dev system

• Storage 10x HP StorageWorks MSA P2000 G3

• Consists of COMPUTE NODES and STORAGE NODES

Page 27: All up datawarewhouse – from smp to parallel

SQL

DATA RACK

• Data Rack Servers 10 active + 1 passive

• HP ProLiant DL360 G7 compute nodes

• InfiniBand, FC and Ethernet switching, 42U rack

• Expansion Grow from 1–4 data racks, storage options, test/dev system

• Storage 10x HP StorageWorks MSA P2000 G3

COMPUTE NODE

Each MPP node is a highly tuned symmetric multi-processing (SMP) node with standard interfaces

Provides dedicated hardware, database, and storage

Runs SQL Server

Spare Node provides failover in case of node failure

Drives are configured as RAID 1

Page 28: All up datawarewhouse – from smp to parallel

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

Client Drivers

ETL Load Interface

Support/Patching

Corporate BackupSolution

CONTROL RACK DATA RACK

PDW – Client Connectivity

Page 29: All up datawarewhouse – from smp to parallel

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

QUERY

CONTROL RACK DATA RACK

??????

???

?????????

???

???

???

???

???

???

PDW – Query Processing

Page 30: All up datawarewhouse – from smp to parallel

Replicated

A table structure exists as a full copy within each discrete Parallel Data Warehouse node.

Data Layout Approaches

Distributed

A table structure is hashed on a single column and uniformly distributed across all nodes on the appliance. Each distribution is a separate physical table in the database management system (DBMS).

Ultra Shared-Nothing

Provides the ability to design a schema of both distributed and replicated tables to minimize data movement between nodes. Small sets of data can be more efficiently stored in full

(replicated). Certain set operations (such as single-node operations)

are more efficient against full sets of data.

Page 31: All up datawarewhouse – from smp to parallel

Ultra Shared-Nothing Architecture

Extends Traditional Shared-Nothing Design Pushes shared-nothing architecture into the SMP node—there is IO and CPU affinity

within SMP nodeso Eliminates contention for user querieso Uses full resources for each user query

Provides multiple physical instances of tableso Distributes large tableso Replicates small tables

Redistributes rows as needed

Provides Fault Tolerance All hardware components have redundancy (including CPUs, disks, networks, power, and

storage processors) Control and compute nodes use failover clustering Management nodes have active and standby states

Page 32: All up datawarewhouse – from smp to parallel

Administrative Console

https://controlnodeipaddress

Dashboard Query activity Load activity Backup and restore Active locks Active sessions Alerts Appliance state

Page 33: All up datawarewhouse – from smp to parallel

Parallel Data Warehouse Configuration Manager

Appliance topology

Services status Network

configuration Privileges

Page 34: All up datawarewhouse – from smp to parallel

Parallel database copy technology enables rapid data movement and consistency between EDW and data marts

Create SQL Server 2008 R2, Fast Track Data Warehouse,and SQL Server Analysis Services Data Marts

Supports user groups with very different service-level agreements (SLAs):• Performance• Capacity• Loading• Concurrency

Flexible Business Alignment

A distributed architecture gives you the flexibility to add or change diverse workloads

or user groups while maintaining data consistency across the enterprise

Page 35: All up datawarewhouse – from smp to parallel

Landing Zone

ETL Tools

Distributed Data Warehouse ArchitecturesDepartmental Reporting

RegionalReporting

High-Performance

ReportingCentral

EDW Hub

RegionalReporting

with Business Decision

Appliance

Third-Party

RDBMSThird-PartyData

Integration

Mobile Applicat

ions

Page 36: All up datawarewhouse – from smp to parallel

Determining the Right SolutionWhat is the workload? Number of concurrent users Query complexity Query mix Load processing Performance requirements

What is the customer looking for in a solution? Simplicity in the appliance 100 percent compatibility with SQL Server 2008 R2 Enterprise scalability Economical hardware Incremental expansion and high availability by default

Page 37: All up datawarewhouse – from smp to parallel

Parallel Datawarehouse

Enterprise-class scalability to hundreds of terabytes High performance Interoperability with leading BI products Mission critical support and maintenance Mature SQL Server platform with high security and

robust engineering process Strong data warehouse vision and roadmap that includes

industry-leading technologies

Value to Customer

Supporting Features

MPP with ultra shared-nothing architecture Distributed query optimization Balanced hardware with pre-tested and pre-tuned appliances optimized for

data warehousing Third-party product integration (for example, Microstrategy, Business Objects,

and Informatica) Mission critical support and maintenance Road map includes column store, petabyte scalability, real-time data

warehousing, MDM, and data quality

Page 38: All up datawarewhouse – from smp to parallel

© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment

on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.