user group bi
TRANSCRIPT
Franck Sidi Lead SQL Server & Bi – Microsoft Israel
SQL Server Parallel DWH Architecture “Aka : Madison”
Agenda
Concepts and Principles
Reference Architectures “FastTrack”
Madison functional overview
Early adoption
Symmetric Multiprocessing
Single DB instance
“Shared Everything” Architecture
Server/CPU’s share
memory
disks
Can lead to resource contention as you scale
SMP
Massively Parallel Processing
Server/CPU’s have their own dedicated resources
“Shared Nothing” Architecture
“Secret Sauce” is parallelizing operations
Lightning-fast Queries, Data Loads and Updates
Linear Scalability
Problem needs to be partitionable
MPP
SMP vs MPP
SMP
HW advancements increasing ability to scale-up
Scaling is limited
High end SMP very expensive
Extremely high concurrency for some workloads
Less than 1-2 TB of data SMP will almost always be better
Full SQL Server functionality
HA must be architected in
MPP
HW advancements increasing ability to scale-up & scale-out
Scaling to 1 PB+
Scale out is relatively low cost
Relatively high concurrency for complex workloads
> 2 TB up to 1 PB
Limited SQL Server functionality
HA is built in
Agenda
Concepts and Principles
Reference Architectures “FastTrack”
Madison functional overview
Early adoption
How some solve the problem today
Big SAN
Biggest 64-core Server
Connected together!
What’s wrong with
this picture???
System out of balance
This server can consume 16 GB/Sec of IO, but the SAN can only deliver 2 GB/Sec
Even when the SAN is dedicated to the SQL Data Warehouse, which it often isn’t
Lots of disks for Random IOPS BUT
Limited controllers Limited IO bandwidth
System is typically IO bound and queries are slow
Despite significant investment in both Server and Storage
Where Does an I/O Go?
Understand potential throughput of the hardware
Each component in the path has associated speed/bandwidth
Know where the potential bottlenecks exist
Cach
e
Fron
t End
Po
rts
Co
ntro
llers/Pro
cessors
Switch
Host
Switch
PCI Bus HBA Fiber Channel Ports Array Processors Disks
Potential Performance Bottlenecks
FC
HBA
A
B
FC
HBA
A
B FC
SW
ITC
H
STORAGE
CONTROLLER A
B
A
B CA
CH
E
SE
RV
ER
CA
CH
E
SQ
L S
ER
VE
R
WIN
DO
WS
CP
U C
OR
ES
CPU Feed Rate HBA Port Rate Switch Port Rate SP Port Rate
A
B
DISK DISK
LUN
DISK DISK
LUN
SQL Server
Read Ahead Rate
LUN Read Rate Disk Feed Rate
The alternative: A balanced system
Design a server + storage configuration that can deliver all the IO bandwidth that CPUs can consume when executing a SQL Relational DW workload
Avoid sharing storage devices among servers
Avoid overinvesting in disk drives Focus on scan performance, not IOPS
Layout and manage data to maximize range scan performance and minimize fragmentation
Sequential I/O
Sequential I/O
Ideal for data warehousing
Large reads & writes
Scans on large data stores are usually read with sequential read patterns and not random read patterns
Scalable, predictable performance
Requires 1/3 or fewer drives for same performance
Random I/O
Ideal for OLTP
Small reads and writes
OLTP usually random-read centric Seek queries are a goal in OLTP query optimization
Seeks usually cause random reads
Not as predictable & scalable for data warehousing
Requires large number of drives
All databases contain both scans and seeks among with other types of reads and writes, DW workload indicate that the vast majority of reads are sequential – not all
What is Fast Track Data Warehouse?
A method for designing a cost-effective, balanced system for Data Warehouse workloads Reference hardware configurations developed in conjunction with hardware partners using this method Best practices for data layout, loading and management
Relational Database Only – Not SSAS, IS, RS
Fast Track Scope
Analysis Services
Cubes
PerformancePoint
Reporting Services
Web Analytic Tools
Dedicated SAN,
Storage Array
SharePoint Services
Microsoft Office SharePoint
Data Warehouse
Data Staging,
Bulk Loading
Presentation Layer Systems
Reference Architecture Scope (dashed)
Excel Services
Pre
sen
tati
on
Data
P
resen
tati
on
Data
Lower TCO Minimizes risk of overspending on un-balanced hardware configurations
Commodity Hardware
Choice HW platform
Implementation vendor
Reduced Risk Validated by Microsoft
Encapsulates best practices
Known performance & scalability
Benefits of Fast Track appliance model
Fast Track DW Reference Configurations
Server CPU CPU
Cores SAN Data Drive Count
Initial
Capacity*
Max
Capacity**
HP Proliant
DL 385 G6
(2) AMD Opteron Istanbul
six core 2.6 GHz
12 (3) HP MSA2312fc (24) 300GB 15k
RPM SAS
6TB 12TB
HP Proliant
DL 585 G6
(4) AMD Opteron Instanbul
six core 2.6 GHz
24 (6) HP MSA2312fc (48) 300GB 15k SAS 12TB 24TB
HP Proliant
DL 785 G6
(8) AMD Opteron Istanbul
six core 2.8 GHz
48 (12) HP MSA2312 (96) 300GB 15k SAS 24TB 48TB
Dell PowerEdge R710 (2) Intel Xeon Nehalem
quad core 2.66 GHz
8 (2) EMC AX4 (16) 300GB 15k FC 4TB 8TB
Dell Power Edge R900 (4) Intel Xeon Dunnington
six core 2.67GHz
24 (6) EMC AX4 (48) 300GB 15k FC 12TB 24TB
IBM X3650 M2 (2) Intel Xeon Nehalem
quad core 2.67 GHx
8 (2) IBM DS3400 (16) 200GB 15K FC 4TB 8TB
IBM X3850 M2 (4) Intel Xeon Dunnington
six core 2.67 GHz
24 (6) IBM DS3400 (24) 300GB 15k FC 12TB 24TB
IBM X3950 M2 (8) Intel Xeon Nehalem four
core 2.13 GHz
32 (8) IBM DS3400 (32) 300GB 15k SAS 16TB 32TB
Bull Novascale R460
E2
(2) Intel Xeon Nehalem
quad core 2.66 GHz
8 (2) EMC AX4 (16) 300GB 15k FC 4TB 8TB
Bull Novascale R480
E1
(4) Intel Xeon Dunnington
six core 2.67GHz
24 (6) EMC AX4 (48) 300GB 15k FC 12TB 24TB
* Core-balanced compressed capacity based on 300GB 15k SAS not including hot spares and log drives. Assumes 25% (of raw disk space) allocated for Temp DB. ** Represents storage array fully populated with 300GB15k SAS and use of 2.5:1 compression ratio. This includes the addition of one storage expansion tray per enclosure. 30% of this storage should be reserved for DBA operations
SMP Server
per 4-Cores
Per MSA2312 Drive Details • Each MSA can hold 12 drives, this configuration requires 11 • MSA is 2U in total (capacitor eliminates need for battery) • Each MSA SP port controls 4 LUNs • Each pair of LUNs consists of (2) 300GB 15k SAS drives RAID1
Each SP rated at 500MB/s or 1000MB/s for both SP’s
Using 300GB 15k SAS drives each LUN rated at 125MB/s each SP controls 4 LUN’s at 500MB/s or 1000MB/s per MSA DAE
Each SP port rated at 4Gb/s or 400MB/s and 1600MB/s for all 4 SP ports.
Each HBA port rated at 4Gb/s or 400MB/s and 1600MB/s for all 4 SP ports.
SWIT
CH
SP A
SP B
03 04
RAID GP02
LUN3
LUN4
01 02
RAID GP01
LUN1
LUN2
05 06
RAID GP03
LUN5
LUN6
07 08
RAID GP04
LUN7
LUN8
09 10
RAID GP05
LUN0 (Logs)
HS ONLY 8
data
disks !!!
Fast Track DWCore-Balanced Architecture
Fast Track Data Warehouse Components
Software:
• SQL Server 2008 Enterprise
• Windows Server 2008
Hardware:
• Tight specifications for servers, storage and networking
• ‘Per core’ building block
Configuration guidelines:
• Physical table structures
• Indexes
• Compression
• SQL Server settings
• Windows Server settings
• Loading
RA: Tightly Spec'd
RAs include not only hardware but best practices in:
Window OS configuration
SQL Server startup options
Database physical layout
Table types
Indexing
Statistics
Managing fragmentation
Loading procedures
Fast Track Case Study - Results
Teradata SQL Server Fast Track DW
Comparison
Loading – Subject Area 1
5:10:21 total time 51:31 total time R SQL Server 6x faster
Loading – Subject Area 2
4:36:08 total time 1:50.01 total time R SQL Server 2.5x faster
Query times – Subject Area 1
3:03 avg query time (using 9 benchmark queries)
0:15 avg query time (using 9 benchmark queries)
R SQL Server 12x faster
Query times – Subject Area 2
56:44 avg query time (using 4 benchmark queries)
8:09 avg query time (using 4 benchmark queries)
R SQL Server 7x faster
Agenda
Concepts and Principles
Reference Architectures “FastTrack”
Madison functional overview
Early adoption
About DATAllegro…
Industry
Standard
Networking
Proprietary Appliance
Management and
MPP Database
Industry
Standard
Storage
Open Source
Database and OS
Technology Partners
Industry
Standard
Servers
Integration Plans Provide scale out through MPP on SQL Server and Windows
Offer ‘Appliance like’ user experience to Data Warehouse customers
Lower TCO to high end Data Warehousing
Offer integrated BI platform to small and very large Enterprises
OPEN SOURCE DATABASE & OS
Industry Standard Servers
Industry Standard Networking
Industry Standard Storage
MPP Additional Considerations
Principles & approach of SMP carry forward
Deeper level of complexity – High Availability
Parallelization
Inter node data movement
Modular building blocks Balanced CPU and storage
Both SMP and MPP are based on building blocks that scale by the CPU core
Adds network, storage processing and disk bandwidth for each core
Based on maximizing & sustaining true sequential I/O while minimizing disks
Generally changes balance of systems so more can be spent on CPU and SW than on storage to give better overall performance for a given budget
Building blocks can be adjusted for multiple MPP configurations – high performance, archive and extreme performance
The future of SQL Server Data Warehousing – Project "Madison"
Predictable Scale out through MPP
Customers with over 400 TB data warehouses
Commodity Hardware
Lower cost
Frequent performance improvements
Easier upgrade and maintenance
Higher customer comfort
Better compatibility
Ultra Shared Nothing
An extension of traditional shared nothing design
Push shared nothing architecture into SMP node
IO and CPU affinity within SMP nodes Eliminate contention per user query
Use full resources for each user query
Multiple physical instances of tables
Distribute large tables
Replicate small tables
Distribute AND Replicate medium tables
Re-Distribute rows “on-the-fly” when necessary
Madison Server Components
Control
Compute
Storage
Landing Zone
Backup
Management
Failover/Spare
Spare Database Server
Du
al Fib
er
Ch
an
nel
Database Servers
SQL
SQL
SQL
SQL
SQL
SQL
SQL
SQL
Du
al I
nfi
nib
an
d
Control Nodes
Active / Passive
SQL
System Architecture Database Servers
Du
al I
nfi
nib
an
d
Control Nodes
Active / Passive
Spare Database Server
Du
al Fib
er
Ch
an
nel
SQL
SQL
SQL
SQL
SQL
SQL
SQL
SQL
SQL Client Drivers
ETL Load Interface
Corporate Backup
Solution
Data Center
Monitoring
Corporate Network Private Network
8Gbs Fiber Channel Local San
20Gbs Infiniband DMS Backbone
IPoIB Dedicated LAN
Software Architecture
SQL Server
DW Authentication
DW Configuration
DW Queue
DW Schema
Compute Nodes Compute Nodes
Compute Nodes
Landing Zone
Backup Node
Management Node
Built by DWPU Existing MS software 3rd Party
SQL Server
DMS
User Data
IIS
Nexus Query Tool
JDBC OLE-DB ODBC
Ado.Net
Admin Console
MS BI (AS, RS)
DMS
DMS
Loader Client
SQL SSIS
HPC AD
Madison Service
DSQL Core Engine
Services DMS
Manager
SQL OS
DMS
SQL OS
Control Node & Client Drivers Client connections always go through the control node
Clustered to a passive node
Processes SQL requests
Prepares execution plan
Orchestrates distributed execution
Local SQL Server to do final query plan processing / result aggregation
Will use same set of drivers used by DATAllegro Provided by DataDirect
ODBC, OLE-DB, JDBC and Ado.Net client drivers
Wire protocol (SeQuel Link)
Available drivers for 32 and 64 bits
Compute Nodes
A SQL Server 2008 instance
DB engine nodes autonomous on local data
SQL as primary interface
Each MPP node is a highly tuned SMP node with standard interfaces
Landing Zone Provides high capacity storage for data files from ETL processes
Integration services available on the landing zone
Connected to internal network
Available as sandbox for other applications and scripts that run on internal network.
Source Landing
Zone Files Data
Loader Compute
Nodes
Backup Node
Builds on SQL Server native backup/restore facility
Use VDI interface to plug into backup pipeline
Database-level backup
Coordinated backup across the nodes
Quiesce write activity to synchronize
Can only restore to another appliance with exactly the same number of distributions
Data Distribution & Replication
Compute Nodes Storage Nodes Control Node
Spare Node
Tables Are Hash
Distributed Or
Replicated
Landing Zone Node
Text
File Text
File Text
File Text
File
Date Dim D_DATE_SK D_DATE_ID D_DATE D_MONTH …
Item I_ITEM_SK I_ITEM_ID I_REC_START_DATE I_ITEM_DESC …
Store Sales Ss_sold_date_sk Ss_item_sk Ss_customer_sk Ss_cdemo_sk Ss_store_sk Ss_promo_sk Ss_quantity …
Promotion P_PROMO_SK P_PROMO_ID P_START_DATE_SK P_END_DATE_SK …
Store S_STORE_SK S_STORE_ID S_REC_START_DATE S_REC_END_DATE S_STORE_NAME …
Customer C-CUSTOMER_SK C_CUSTOMER_ID C_CURRENT_ADDR …
Customer Demographics
CD_DEMO_SK CD_GENDER CD_MARITAL_STATUS CD_EDUCATION …
Database Distributed & Replicated Tables
C I
D
CD
S
P
C I
D
CD
S
P C I
D
CD
S
P
C I
D
CD
S
P
C I
D
CD
S
P
C I
D
CD
S
P
C I
D
CD
S
P
C I
D
CD
S
P
SS
SS
SS
SS
SS
SS
SS
SS
Physical Storage Configuration – Single Node LUN 8 LUN 2 LUN 3 LUN 1
Log LUN
UserDB Log TempDB Log StageDB Log
Local Drive 1 Local Drive 2 Local Drive 3 Local Drive 4 Local Drive 5 Local Drive 6
Tem
pD
B
TempDB1.mdf TempDB2.ndf TempDB3.ndf TempDB4.ndf TempDB5.ndf TempDB6.ndf
User
Data
ba
se(s
)
FG Dist B FG Dist C FG Dist H FG Dist A
DistData1.mdf
DistData2.ndf
DistData3.ndf
DistData4.ndf
DistData5.ndf
DistData6.ndf
DistData7.ndf
DistData8.ndf
Replicated FG
ReplData1.mdf
ReplData2.ndf
ReplData3.ndf
ReplData4.ndf
ReplData5.ndf
ReplData6.ndf
ReplData7.ndf
ReplData8.ndf
Sta
gin
g
Data
ba
se
FG Stage B FG Stage C FG Stage H
Replicated FG
FG Stage A
StageData1.mdf
StageData2.ndf
ReplData1.mdf
ReplData2.ndf
ReplData3.ndf
ReplData4.ndf
ReplData5.ndf
ReplData6.ndf
ReplData7.ndf
ReplData.ndf
StageData3.ndf
StageData4.ndf
StageData5.ndf
StageData6.ndf
StageData1.ndf
StageData2.ndf
Create Table – Behind the Scenes
Microsoft Confidential
Create Table store_sales
with
distribute_on (ss_item_sk)
partition_on(ss_sold_date_sk)
cluster_on (ss_sold_date_sk)
Create Table mad_store_sales_a Create Table mad_store_sales_ … Create Table mad_store_sales_h
8K 8K
8K 8K
8K
8 Filegroups
1 Table per FG
Distribution_a
thru
Distribution_h
12 Partitions
(ss_sold_date_sk)
N-number of
Pages
Tuple
Multiple levels of redundancy:
• Leveraging MSCS for node availability
• Cluster aware services:
• SQL Server, Madison, DMS
• Leveraging MSCS for SQL Services, DMS
• 1 spare node for every 8* compute nodes
High Availability
8x1
Security and Encryption Retain DA v3 design
Authentication and authorization done by Madison server Users and Roles as first class principals Nested role capabilities Connection to SQL back-ends through high privilege account SQL nodes reside on private network
No support for integrated auth Leverages TDE to expose DB-level encryption
Supports key rotation
The Logical Data Model
Multiple databases per appliance Each user database maps to one SQL Server db per node
Tables Replicated, Distributed, Replicated + Distributed
Leverage SQL Server compression
Supports Partitioning
Supports secondary indexes
Views
Data Types Most scalar data types supported by SQL Server 2008 are supported by Madison
Main exceptions Character and binary strings limited to 8K (i.e. no BLOB support)
XML
Sql-Variant
System and CLR UDTs
Latin1_General with binary comparison only
SQL Server Data Types DAv3 Madison
bigint P P
binary
bit P
char / nchar P P
date, time P
datetime (was date in DA) P P
datetime2 P
datetimeoffset P
decimal P P
float P P
geometry / geography
hierarchyid
Int (was integer in DA) P P
money P
real P
smalldatetime P
smallint P P
smallmoney P
sql_variant
text / ntext / image
timestamp
tinyint P P
varchar / nvarchar / varbinary P P
v*(max)
uniqueidentifier
xml
Supported SQL Syntax
Aligned with ANSI SQL 92 Basic INSERT, UPDATE, DELETE, SELECT
CREATE TABLE AS SELECT
Limited analytical function support
Teradata extensions Quantile, Sample,…
Configuration and Monitoring
Madison services instrumented Logs and Performance Counters
Capture and forward SNMP alerts from devices within the appliance
Small subset of DMVs to union underlying node DMVs
Leverage HPC for monitoring
Challenge: Is it an appliance or a collection of nodes?
Manageability
Web-based main administrative user interface Based on DATAllegro manageability UI
Monitoring system health and activity
Leveraging HPC pack 2008 Systems management
Monitoring
Cluster health
Query Tools GUI Tool:
Nexus (CoffingDW)
Table & view object explorer
Interactive query execution
Command line tool: Replacement for DA-SQL
Flavor of SqlCmd
MS BI Integration Integration Services
Madison enabled as a source Data movement, lookup operations, etc.
Will add a new SSIS destination Ensure integrated high performance loads
Reporting Services
Fully supported; including parameterized queries Will customize experience for report builder and report designer
Analysis Services
Will get connectivity through OLE-DB provider Will enable both MOLAP and ROLAP storage
High Level Release Definitions
“Madison” (aka v1)
Focus on time to market Compatibility with DATAllegro v3 MS BI integration H1 2010
Closer functional alignment with SQL Server Better integration with SQL and MS ecosystem, tools and technologies
V2+ Will start
running MTPs in the
summer
Recap
Data Warehousing Reference Architectures available today!
SQL Server Fast Track
SQL Server “Madison” Built for advanced, large scale data warehouses Shared-nothing MPP architecture
Early evaluation programs starting soon All feedback welcome:
Thank you!