Large-Scale SQL Server Deployments for DBAs (Unisys)
Post on 19-Dec-2015
TRANSCRIPT
Large-Scale SQL Server Deployments for DBAs
Unisys

Agenda
- Optimization
  - Failover Cluster
  - Resources
  - Indexes
  - I/O Configuration
  - Data Management
- Enterprise-Class Systems Infrastructure
  - Intel Platform Architecture
  - Storage Architecture
  - Microsoft Systems Architecture
SQL Server Failover Clustering
- Hot standby solution
- Best high-availability configuration:
  - Redundant system
  - Shared access to the database files
  - Recovery in seconds
  - Automatic failure detection
  - Automatic failover
  - Minimal client application awareness
- Built on the server cluster technology of Windows Clustering
Windows Clustering
Hardware components:
- Cluster node
- Heartbeat
- External network
- Shared cluster disk array
- Quorum drive
Software components (virtual server):
- Cluster name
- Cluster IP address
- Cluster Administrator account
- Cluster resource
- Cluster group
- Microsoft Distributed Transaction Coordinator (MS DTC)
How Failover Clustering Works
- Operating-system checks: the heartbeat checks availability of nodes and virtual servers.
- SQL Server checks:
  - The LooksAlive check runs every five seconds.
  - The IsAlive check runs a SELECT @@SERVERNAME query.
- Failover to another node: Windows Clustering attempts a restart on the same node, or fails over to another node.
  - The SQL Server service starts.
  - master is brought online.
  - Database recovery proceeds.
  - End users and applications must reconnect.
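The IsAlive-style check described above can be reproduced by hand against the virtual server; a minimal sketch (the second column is an addition, not part of the deck, using the standard SERVERPROPERTY function):

```sql
-- Verify the virtual server responds, as the IsAlive check does with
-- SELECT @@SERVERNAME; also report whether the instance is clustered.
SELECT @@SERVERNAME                    AS virtual_server_name,
       SERVERPROPERTY('IsClustered')   AS is_clustered;   -- 1 on a failover cluster
```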
Enhancements to Failover Clustering
- SQL Server Setup installs and uninstalls a cluster.
- Service packs can be applied directly to virtual servers.
- SQL Server supports multiple instances and multiple network addresses.
- Failover and failback occur to or from any node in a cluster.
- SQL Server 2000 on Windows 2000 Datacenter Server supports four server nodes in a cluster.
- All nodes have local copies of the SQL Server tools and executables.
- Rerunning Setup updates failover cluster configurations.
- SQL Server Service Manager or SQL Server Enterprise Manager now starts and stops SQL Server services.
SQL Server 2000 Failover Clustering
[Diagram: a SQL Server 2000 virtual server instance (system and user databases, SQL Server and SQL Server Agent services, executables, network connection) running on MSCS; Node A and Node B share a disk array and are connected by a heartbeat link and the public network.]
Instance Example
[Diagram: directory structure example — shared Tools; a default instance plus named instances Instance 1 and Instance 2, each with its own Bin and Data directories.]
Four-Node Clusters
- SQL Server 2000 fully supports 4-node failover clustering.
- You can assume at most one node failure and assign resources appropriately (e.g., leave one node out of four idle): 25% redundancy, which improves on the 50% redundancy of a 2-node Windows 2000 Advanced Server cluster.
- When running Active/Active/Active/Active for critical workloads, ensure that one node has the power and resources (i.e., memory) to handle the worst case.
SQL Server 2000 Cluster Storage Considerations
- Fibre Channel connected to a SAN is the preferred approach.
- Network Attached Storage (NAS) is not supported for clusters (KB Q304261):
  "Because of the risks of network errors compromising database integrity, together with possible performance implications that may result from the use of network file shares to store databases, Microsoft recommends that you store database files either on local disk subsystems or on Storage Area Networks (SANs)."
SQL Server 2000 Cluster Storage Considerations
- Basic disks are required by MSCS (Microsoft Cluster Service).
- Dynamic disks and mount points are not natively supported (Veritas required).
- Software RAID is not supported.
- Balance the available drive letters against the placement of:
  - Log files
  - Data files
  - Filegroups
  - tempdb
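The placement trade-off above is expressed when a database is created; a minimal sketch (database name, file names, and drive letters are illustrative assumptions):

```sql
-- Separate data files, a secondary filegroup, and the log onto
-- different cluster drive letters.
CREATE DATABASE Sales
ON PRIMARY
    (NAME = Sales_data,    FILENAME = 'D:\SQLData\Sales_data.mdf'),
FILEGROUP Archive
    (NAME = Sales_archive, FILENAME = 'E:\SQLData\Sales_archive.ndf')
LOG ON
    (NAME = Sales_log,     FILENAME = 'L:\SQLLog\Sales_log.ldf');
```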
Cluster Failover Memory & CPU Configuration Planning
- Memory resources
- CPU resources
- If you poured the contents from one glass into another, would all the contents fit into one glass?
- 64-bit dynamically manages memory.
Resources
- Memory: very large memory support, AWE, AWE & I/O, settings
- Processor allocation: User Mode Scheduler, settings, affinity
Windows 2000 Very Large Memory Support
- Real memory: the 4 GB limit and the /3GB switch
- Physical Address Extension (PAE)
- Address Windowing Extensions (AWE)
- Large Memory Enabled (LME)
- AWE I/O implications
- Much simpler in 64-bit
Windows Product Options

  Product                           /3GB   PAE/AWE   Limit
  Windows 2000 Server               No     N/A       4 GB
  Windows 2000 Advanced Server      Yes    Yes       8 GB
  Windows 2000 Datacenter Server    Yes    Yes       16 GB
  Windows 2000 Datacenter Server    No     Yes       32 (64) GB
Memory Management 101
[Diagram: each process — here SQL Instance 1, 2, and 3 — gets its own 4 GB virtual address space, of which the Windows 2000 kernel reserves the space above the 2 GB (or, with /3GB, 3 GB) user boundary; the virtual address spaces are mapped onto the 4 GB of physical memory.]
Virtual Memory and AWE
- Windows 2000, a 32-bit operating system, can only address 2^32 bytes (4 GB).
- So how do we support more memory?
- The next few slides look at how AWE enables SQL Server to use 64 GB.
AWE Memory Model
- The AWE window resides in real memory (below 4 GB).
- Data pages are mapped to the reserved buffer pool above 4 GB.
- Applications call the AWE API to get data pages; data pages are referenced or mapped.
- Allows huge amounts of data to be cached (look for the Page Life Expectancy counter in Perfmon).
- AWE provides a 64 GB address space to SQL Server.
- Largest benefits on systems with large amounts of frequently referenced ("hot") pages.
- Note: SQL Server-based applications that scan one time through large tables and/or have large intermediate result sets may not benefit much from a larger buffer pool.
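The Page Life Expectancy counter mentioned above can also be read from inside SQL Server 2000; a sketch using the master.dbo.sysperfinfo view (which exposes the instance's Perfmon counters):

```sql
-- Seconds a page is expected to stay in the buffer pool without being
-- referenced; low values suggest the cache is churning.
SELECT cntr_value AS page_life_expectancy_sec
FROM   master.dbo.sysperfinfo
WHERE  object_name  LIKE '%Buffer Manager%'
  AND  counter_name = 'Page life expectancy';
```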
AWE Memory Window
[Diagram: step 1 — physical memory above 4 GB is allocated for SQL Instance 1.]

AWE Memory Allocation
[Diagram: the AWE memory window in the instance's virtual address space (below the kernel's 2 GB or 3 GB boundary) is mapped onto the allocated physical memory above 4 GB.]

AWE Memory Reallocation
[Diagram: to reach different pages, an existing mapping is removed and a new mapping is created through the same AWE memory window.]
SQL Server Use of AWE Memory
- Using AWE means that SQL Server 2000 memory is no longer dynamic (i.e., it can no longer be relinquished if the system is under stress).
- Use caution when using AWE and SQL Server with other workloads: configure memory appropriately, or SQL Server will allocate the total available memory (minus 128 MB).
- Use caution when running in a cluster: allow enough memory for failover on each node.
AWE and I/O
- Applications can only read and write through the AWE memory window.
- Two methods of achieving this:
  - With LME hardware: the hardware can access mapped pages anywhere in memory.
  - Without LME hardware: the hardware can only access mapped pages from below 4 GB.
AWE I/O With LME (required for Datacenter certification)
[Diagram: network and disk hardware perform direct I/O into the AWE-mapped physical memory above 4 GB.]

AWE I/O Without LME
[Diagram: network and disk hardware can only reach memory below 4 GB, so I/O is double-buffered through the AWE memory window.]
Large Memory Settings
- To grab 4 GB or less:
  - No /PAE needed in boot.ini.
  - /3GB is available to grab 3 of the 4 GB of real memory.
  - AWE can be enabled but does nothing for SQL Server.
- To grab more than 4 GB, up to 16 GB:
  - /PAE enabled in boot.ini
  - /3GB enabled in boot.ini
  - AWE enabled
- To grab more than 16 GB:
  - /PAE enabled
  - AWE enabled
  - /3GB disabled
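In practice the middle case above (more than 4 GB, up to 16 GB) combines a boot.ini edit with one sp_configure option; a sketch (the exact boot.ini ARC path varies by system and is shown only as an illustration):

```sql
-- boot.ini entry (edited outside SQL Server), shown here as a comment:
-- multi(0)disk(0)rdisk(0)partition(1)\WINNT="Windows 2000" /PAE /3GB

-- Then enable AWE for the instance (advanced option; takes effect
-- after the SQL Server service restarts):
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'awe enabled', 1;
RECONFIGURE;
```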
Memory Configuration Settings
- Min server memory (MB):
  - SQL Server will maintain this value once it is committed.
  - Not automatically allocated on startup unless AWE is enabled.
- Max server memory (MB):
  - Becomes SQL Server's target memory unless (available bytes - 5 MB) is less than max server memory.
  - Buffer pages will continue to be committed until the target is reached.
  - Under memory pressure, SQL Server pages out the buffer pool until min server memory is reached.
  - Set to the same value as min when AWE is enabled.
- AWE enabled: enables the AWE window for SQL Server.
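The "set min equal to max when AWE is enabled" rule above looks like this; a sketch with an illustrative 12 GB value (chosen so a cluster node could still absorb a failed-over instance, per the earlier caution):

```sql
-- With AWE, SQL Server memory is static, so fix min and max server
-- memory to the same value; 12288 MB is an illustrative number only.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'min server memory (MB)', 12288;
EXEC sp_configure 'max server memory (MB)', 12288;
RECONFIGURE;
```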
Memory Configuration Settings (cont.)
- Locks:
  - Sets the maximum number of locks.
  - When set to 0, 2% of memory is allocated to the lock pool initially.
  - The dynamic lock pool does not exceed 40% of the total server memory allocation.
- Set working set size:
  - Reserves physical memory for SQL Server instead of letting it be swapped out; set min and max server memory to the same value when enabling this option.
- Min memory per query (KB):
  - Sets the minimum memory grant that must be met by the Memory Grant Manager to avoid the query waiting.
- Query wait (s):
  - How long the Memory Grant Manager waits to acquire min memory per query before the query times out.
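The last two options above are set the same way as the memory options; a sketch with illustrative values only (defaults: 1024 KB for min memory per query, and -1 for query wait, meaning 25 times the estimated query cost):

```sql
-- Illustrative values: raise the per-query memory floor and put a
-- hard 300-second ceiling on memory-grant waits.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'min memory per query (KB)', 2048;
EXEC sp_configure 'query wait (s)', 300;
RECONFIGURE;
```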
Resources
- Memory: very large memory support, AWE, AWE & I/O, settings
- Processor allocation: User Mode Scheduler, settings, affinity
User Mode Scheduler
[Diagram: work requests arrive over the network and are placed on a per-CPU UMS work queue; each request is assigned to a worker in that CPU's worker pool; workers in "Runnable" status are assigned to the CPU by its UMS scheduler, one scheduler per CPU (CPU 0 … CPU n); query results return over the network.]
Context Switching
- No real multiprocessing, just time slicing.
- SQL Server User Mode Scheduler: one per processor; the scheduler assigns the processor.
- Performance factors:
  - Perfmon: Context Switches/sec (10-15K is bad)
  - CPUs have L2 cache
  - CPU affinity mask
  - Lightweight pooling
Related SQL Server Configuration Settings
- Max worker threads:
  - Sets the maximum number of workers to service work requests.
  - Assigned evenly across the CPUs available to UMS.
- Lightweight pooling:
  - When set, workers are created as NT fibers.
  - One thread per CPU manages a subset of the fibers.
- Priority boost:
  - If set, workers execute at NT priority 14; by default, workers execute at priority 7-8.
- Max degree of parallelism:
  - Number of processors considered for a parallel query execution plan.
- Cost threshold for parallelism:
  - Minimum execution cost before the optimizer creates a parallel execution plan.
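The two parallelism options above are commonly tuned together; a sketch with illustrative values for an 8-CPU OLTP system (the numbers are assumptions, not recommendations from the deck):

```sql
-- Cap parallel plans at 4 CPUs and raise the cost threshold so short
-- OLTP transactions stay serial; defaults are 0 (all CPUs) and 5.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 4;
EXEC sp_configure 'cost threshold for parallelism', 15;
RECONFIGURE;
```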
SQL Server Affinity Masks
- CPU affinity: reduces context switching, increases cache hits.
- New SQL Server 2000 SP1 features: I/O affinity; connection affinity (VIA only).
SQL Server CPU Affinity / I/O Affinity
[Diagram: a cell-based Unisys server — crossbar interconnects joining TLC caches, each TLC serving four CPUs, with MSU memory units and I/O modules attached to a Storage Area Network; CPU affinity binds SQL Server schedulers to one subset of CPUs while I/O affinity dedicates another subset to disk I/O.]
Example: Set IO Affinity to Processor x0000000F Example: Set IO Affinity to Processor x0000000F Sends SQL Server disk I/O to affinitized processors. Sends SQL Server disk I/O to affinitized processors.
( ( An extreme high end feature An extreme high end feature ))
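CPU affinity is set through sp_configure; a sketch (the 0xF0 mask is an illustrative assumption for an 8-CPU system, complementing the slide's 0x0000000F I/O mask; note that I/O affinity itself is not an sp_configure option but, per Microsoft's SP1 documentation, a service startup parameter):

```sql
-- Bind SQL Server schedulers to CPUs 4-7 (bitmask 0xF0 = 240),
-- leaving CPUs 0-3 free for disk I/O as in the slide's example.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'affinity mask', 240;
RECONFIGURE;
```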
Index Overview
- Clustered and non-clustered indexes
- Covering indexes
- Unique indexes
- Indexed views
- Index maintenance
- Recommendations
Clustered Indexes
[Diagram: B-tree with a root page whose entries (A-C, D-E, F-G, H-J, K-M, N-O, P-Q, R-S, T-V, W-Z) point through non-leaf pages to the leaf pages holding the data rows.]
Example:
- SELECT * FROM table WHERE lastname LIKE 'S%'
- Assume 100 names start with 'S' and 100+ rows per page.
- The clustered index will issue 4 I/Os in this example.
- Contrast with a non-clustered, non-covering index on a heap table.
Clustered Indexes: Choosing a Clustered Key
- The clusterkey is the row locator in all non-clustered indexes:
  - Increases the potential for covering queries.
  - Size consideration: long composite clusterkeys may not be appropriate.
  - Small clusterkeys (smallint or int) save space in non-clustered indexes: an INT is a 4-byte row locator stored in each non-clustered index entry, versus a 6-byte RID.
- Update frequency:
  - Clusterkey columns must be stable; updates are expensive, and the cost increases with the number of non-clustered indexes.
- A continuously ascending clusterkey value is the most efficient model for high-volume insert:

  CREATE TABLE orders
      (Order_ID INT IDENTITY PRIMARY KEY CLUSTERED,
       ...)

  - No hot-spot contention
  - No page splitting
  - Suitable as a default clustered index
Non-clustered Indexes
[Diagram: a non-clustered index on last name over a heap. The root page points to branch pages 4:300-4:302; leaf pages 2:100-2:107 hold entries such as ANDERSON, BENSON, JOHNSEN, JOHNSON, KRANSTEN, MILLER, SIMONSON, and TANNER, each carrying the page:row locator (RID) of its heap row.]

  UPDATE CUSTOMER
  SET CUSTOMER_COMMENT = '.......'
  WHERE CUSTOMER_NAME = 'KRANSTEN'

[Diagram: the updated KRANSTEN row no longer fits on its heap page, so a forwarding pointer is left at the old row location pointing to the new one; this avoids updating the index when the RID changes.]
Example: a non-clustered index carrying the clustering key
[Diagram: root page, non-leaf pages, and leaf pages; each leaf entry carries the clustering key.]
Note: non-clustered indexes are affected by all clustering-key operations, including reindex, drop/create, etc.

Example: drop the clustering key, and the non-clustered key values change to RIDs
[Diagram: the same index, with leaf entries now locating rows on data pages by RID.]
Note: change the clustering key and you change all non-clustered indexes automatically.
Non-clustered Indexes: Composite Index Considerations
- Leading column density:

  CREATE UNIQUE INDEX IX_PRODUCT_TYPE_BRAND_NAME
  ON PRODUCT (PRODUCT_TYPE, PRODUCT_BRAND, PRODUCT_NAME)

- It is important that statistics exist on all columns of the index: sp_createstats 'indexonly'
- Using the left-most subset of columns is not absolutely necessary on non-clustered indexes: with low density on an interior column (and existing statistics), an index scan can use the WHERE argument to filter prior to the bookmark lookup:

  SELECT * FROM PRODUCT WHERE PRODUCT_BRAND = 'XYZ'

  StmtText
  --------------------------------------------------------------------
  |--Bookmark Lookup(BOOKMARK:([Bmk1000]), OBJECT:([orders].[dbo].[PRODUCT]))
       |--Index Scan(OBJECT:(IX_PRODUCT_TYPE_BRAND_NAME), WHERE:(PRODUCT_BRAND='XYZ'))

- If the leading column is used alone in a search argument, its density is crucial.
- If columns are used together (as a search argument), the leading column's density matters less than the composite "All density".
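Density values, including the composite "All density" mentioned above, can be inspected directly; a sketch using the table and index defined on this slide:

```sql
-- Show the statistics header, the density vector (one "All density"
-- row per left-based column prefix), and the histogram for the index.
DBCC SHOW_STATISTICS ('PRODUCT', IX_PRODUCT_TYPE_BRAND_NAME);
```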
Covering Indexes
- Materializes the query result from a non-clustered index:
  - All necessary data must be in the index.
  - Data pages are not touched (no bookmark lookup).
  - References to clustered-key columns are available and used from the non-clustered index.
  - A composite index is useful even if the first column is not referenced.

  SELECT   SUPPLIER_ID, ORDER_ID, COUNT(*)
  FROM     ORDER_ITEM
  WHERE    YEAR(SHIP_DATE) = 1993
  GROUP BY SUPPLIER_ID, ORDER_ID
  ORDER BY SUPPLIER_ID, ORDER_ID

  StmtText
  ----------------------------------------------------------------------
  |--Compute Scalar(DEFINE:([Expr1002]=Convert([Expr1006])))
       |--Stream Aggregate(GROUP BY:(SUPPLIER_ID, ORDER_ID) DEFINE:([Expr1006]=Count(*)))
            |--Sort(ORDER BY:(SUPPLIER_ID ASC, ORDER_ID ASC))
                 |--Index Scan(OBJECT:(IX_ORDER_ITEM_SHIP_DATE), WHERE:(datepart(year, Convert(SHIP_DATE))=1993))

Covering Indexes: Considerations
- Queries covered by indexes are often found during the trivial plan phase (reduced compilation time).
- Multiple indexes can be used: index rows retrieved from non-clustered indexes are joined on the row locator (clusterkey or RID); not a common access method.
- Clustered keys are always available (they exist in the row locator).
- Can reduce some lock contention: the data page/row is not accessed, so it is not locked.
- Avoid exaggerating non-clustered key lengths to force covering queries: a longer key increases index size, and volatile columns cause lock contention in the index leaf.
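A covering index for the query above must contain every column the query references; a sketch (the index name matches the plan shown on the slide, but the column order is an assumption):

```sql
-- SHIP_DATE leads so the WHERE clause is evaluated from the index;
-- SUPPLIER_ID and ORDER_ID ride along so no bookmark lookup is needed.
CREATE INDEX IX_ORDER_ITEM_SHIP_DATE
ON ORDER_ITEM (SHIP_DATE, SUPPLIER_ID, ORDER_ID);
```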
Break
Data Management Overview
- I/O strategies
- Capacity planning guidelines
- RAID
- File system
- Database and object placement strategies
- SAN
- Monitoring
Understanding Your Disks
- Check the numbers.
- There is a difference in speed for new drives.
- Numbers from hardware vendors are perfect-world.

  Rotational   Avg latency   Avg seek   Avg access   Random I/Os   Sequential I/Os
  speed        (ms)          (ms)       time (ms)    per sec (8K)  per sec (64K)
  3600         8.3           13         21.3         45            85
  5400         5.5           11         16.5         60            115
  7200         4.2           7.1        11.3         90            175
  10000        3             5.4        8.4          120           230
Workload Comparison
- OLTP environment:
  - Short transactions
  - Random read and write I/O
  - 50-60% read, 40-50% write is a good starting point for the workload breakdown.
- DSS/OLAP environment:
  - Mostly long transactions
  - Many sequential I/Os, typically reads
  - 75-80% read, 20-25% write is a good starting point.
Capacity Planning Approach
- Minimal: size of database / size of disk
  - Table sizing tool in the W2K resource kit and http://toolbox
  - Not a good approach for performance.
- Guesstimate: estimated data throughput / size of disk
  - Compare with the minimal approach and take the larger number of disks of the two.
- Sampled approach:
  - Take a small sample workload and monitor it on a system with tempdb, log, and data files placed on separate spindles. Use Perfmon to capture the I/O statistics and calculate the spindle requirements under real production load.
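The sampled approach reduces to simple arithmetic: spindles needed = measured I/Os per second / I/Os per second per drive. A sketch with illustrative numbers (2,700 measured IOPS is an assumption; the 90 random 8K I/Os per second per drive comes from the 7200-RPM row of the disk table earlier):

```sql
-- 2,700 random 8K I/Os/sec under load on drives rated ~90 each
-- gives 30 spindles, before allowing for RAID write overhead.
DECLARE @measured_iops int, @iops_per_drive int;
SELECT @measured_iops = 2700, @iops_per_drive = 90;
SELECT CEILING(1.0 * @measured_iops / @iops_per_drive) AS spindles_needed;
```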
Capacity Planning: OLTP
- Random reads/writes: plan on more random reads and writes, or 80 I/Os per second per drive.
- Turn on controller cache to match the read/write ratio; for example, for read-intensive applications, maybe 75% read / 25% write.
- A larger number of smaller drives = better performance.
- Put the log on its own controller with 100% write cache (except in replication environments).
Capacity Planning: Data Warehouse
- Sequential read: plan on 120 I/Os per second per drive due to scans.
- Set the controller for heavy read-through cache.
- In general, spread data and indexes across many drives; don't be a hero and guess where to place objects.
- Consider striping tempdb separately.
- The log is less important.
RAID Introduction
[Diagram: block layouts for RAID 0 (striping, no redundancy), RAID 1 (mirroring), RAID 5 (striping with rotating parity blocks), and RAID 10 (mirrored stripes).]
Which RAID to Choose

Physical I/O cost per logical operation:
- RAID 0: 1 read / 1 write (no redundancy)
- RAID 1 / RAID 10: 1 split read / 2 writes
- RAID 5: 2 reads / 2 writes per logical write
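The per-operation costs above can be sketched as a back-end I/O calculator: given a logical read/write mix, compute the physical I/Os each RAID level generates. The cost table follows the slide (RAID 0 write = 1 I/O, RAID 1/10 write = 2, RAID 5 write = 4, i.e. 2 reads + 2 writes).

```python
# Physical I/Os generated per logical write, by RAID level.
RAID_WRITE_COST = {"raid0": 1, "raid1": 2, "raid10": 2, "raid5": 4}

def backend_ios(reads: int, writes: int, level: str) -> int:
    """Logical reads cost one physical I/O; writes cost the level's penalty."""
    return reads + writes * RAID_WRITE_COST[level]

# 1,000 logical reads + 200 logical writes:
for level in ("raid0", "raid10", "raid5"):
    print(level, backend_ios(1000, 200, level))
# raid0 1200, raid10 1400, raid5 1800
```

This is why a write-heavy workload needs noticeably more spindles on RAID 5 than on RAID 10 for the same host-visible load.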
RAID 5 vs. RAID 10 Performance

I/Os per second:
  Workload                    RAID 5    RAID 10
  2K random read/write           624      1,079
  8K random read/write           480        940
  64K sequential read            589        813

Megabytes per second:
  Workload                    RAID 5    RAID 10
  2K random read/write           1.2        2.1
  8K random read/write           3.8        7.3
  64K sequential read           36.8       50.8

(Iometer version 1998.10.08, Copyright 1997-1998 by Intel Corporation)
RAID Recommendations

RAID 10 is best
- Very high read/write performance
- Higher theoretical fault tolerance

RAID 5 as a last resort
- Highest usable capacity
- Moderate read performance
- Poor write performance
Stripe Configuration

Example: 64K stripe unit, 256K stripe width (four drives).

[Diagram: the stripe unit is the chunk written to a single drive; the stripe width is one unit across every drive in the set.]
RAID Implementation in VLDB

Hardware or software RAID
- Performance considerations
- Cluster support

Battery backup for the controller cache is a must.

Some controllers are performance-optimized for a particular number of disks.

IDE RAID
- Good sequential performance
- Lacks scalability
File System Topics

NTFS vs. FAT32
- NTFS is required for cluster support
- FAT32 has a file size limit

Allocation unit size
- Stripe size must be a multiple of 8K (SQL Server I/O), the 4K block (NT I/O), and 512 bytes (disk I/O).
- 64K is optimal for most SQL Server implementations.
- A larger stripe size (128K, 256K) may be used in data warehouse environments.
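The alignment rule above is easy to check mechanically: a candidate stripe size must divide evenly by the 8K SQL Server page I/O, the 4K NT block, and the 512-byte sector. A minimal sketch:

```python
# Alignment units from the slide: SQL Server page I/O, NT block, disk sector.
ALIGNMENT_UNITS = (8192, 4096, 512)

def valid_stripe(size_kb: int) -> bool:
    """True if the stripe size (in KB) is a multiple of every alignment unit."""
    size_bytes = size_kb * 1024
    return all(size_bytes % unit == 0 for unit in ALIGNMENT_UNITS)

print([kb for kb in (4, 8, 64, 128, 256) if valid_stripe(kb)])
# [8, 64, 128, 256] -- 4K fails because it is smaller than an 8K page I/O
```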
I/O Load Balancing

[Diagram: system bus feeding multiple PCI buses and the network]
- Distribute spindles in a RAID set across available channels.
- Understand your bus architecture; know the point of diminishing returns on spindles per bus.
- Do not saturate one PCI bus and underutilize another; balance your workload and monitor it.
- Factor in network I/O requirements.
- Understand theoretical maximums versus actual, sustainable throughput in all bus architectures.
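The last point above can be illustrated with a quick calculation: peak PCI bandwidth is bus width times clock, but sustained throughput is always lower. The 70% efficiency factor used here is an illustrative assumption, not a measured figure.

```python
def pci_peak_mbs(width_bits: int, clock_mhz: int) -> float:
    """Theoretical peak PCI bandwidth in MB/s: bytes per cycle * cycles/us."""
    return width_bits / 8 * clock_mhz

peak = pci_peak_mbs(64, 66)   # 64-bit / 66 MHz PCI, as in the ES7000 slots
print(peak)                    # 528.0 MB/s theoretical
print(peak * 0.7)              # ~370 MB/s as a rough sustainable estimate
```

Plan spindle counts per bus against the sustainable number, not the theoretical one.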
I/O Utilization: Object Placement

Where to place my objects?
- The log should be on its own mirrored drive or drives.
- OS bits, SQL bits, and the pagefile go on a separate drive.
- Use multiple filegroups: a large, heavily hit table and its indexes should be placed in a separate filegroup for both performance and management reasons.
- Place table data, indexes, and tempdb on separate filegroups and spindles to promote sequential I/O during table/index scans, e.g. index creation (using the sort_in_tempdb option).
- Multiple data files can be used for tempdb if usage is high.
- Database backups should go to separate spindles due to the highly sequential nature of backup and restore operations. The performance improvement over shared spindles can be significant in non-SAN environments, e.g. SCSI subsystems with a limited number of spindles.
Storage Area Network

SAN advantages
- A large cache and a large number of spindles often simplify the storage architecture design.
- High availability, reliability, and often a rich feature set for VLDB environments.

SAN watch-outs
- Storage design is still necessary; a SAN is not the solution for everything.
- Performance information may not be easily accessible.
- Investment in skilled SAN administration is a requirement.
I/O Monitoring

Performance monitor counters
- Processor & SQL Server process: % Privileged Time vs. % Processor Time
- PhysicalDisk: % Disk Time, Avg. Disk Queue Length, Current Disk Queue Length, Avg. Disk Sec/Read, Avg. Disk Sec/Write
- Avoid logging to the disks being monitored.

Vendor tools
- SAN environments often do not report accurate statistics to the OS. Use vendor monitoring tools to obtain comparable information.

SQL Server data file I/O statistics
- To find a transaction log bottleneck, query the virtual table ::fn_virtualfilestats:

  SELECT * FROM ::fn_virtualfilestats(<database id>, -1)

- If IoStallMS / (NumberReads + NumberWrites) >= 20 ms, a log I/O bottleneck may exist.
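The 20 ms rule above can be sketched as a small check over the columns that ::fn_virtualfilestats returns; the sample numbers below are hypothetical.

```python
def log_bottleneck(io_stall_ms: int, reads: int, writes: int,
                   threshold_ms: float = 20.0) -> bool:
    """Flag a possible log I/O bottleneck when the average stall per I/O
    (IoStallMS / (NumberReads + NumberWrites)) reaches the threshold."""
    total_ios = reads + writes
    return total_ios > 0 and io_stall_ms / total_ios >= threshold_ms

# 250,000 ms of accumulated stall over 10,000 I/Os = 25 ms per I/O:
print(log_bottleneck(250_000, 4_000, 6_000))  # True -> investigate the log volume
```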
Enterprise Class Systems Infrastructure

- Enterprise Wintel Architectures
- Enterprise Storage Architectures
- Microsoft Systems Architecture
Basic Computing Architecture

[Diagram: processor, cache, and memory hierarchy; serving requests from cache is "best", from memory is "OK"]
Intel's Server Solution: Profusion Architecture

- Current Intel Xeon Processor MP speeds: 2 GHz, 1.90 GHz, 1.60 GHz, 1.50 GHz, 1.40 GHz
- Currently supported Profusion products: 700 MHz, 900 MHz
- Stops at 8 processors
A High End Architecture

[Diagram: four cells, each containing two groups of four CPUs behind a shared first-level cache (FLC), joined by crossbar intra-connects to I/O modules and memory storage units (MSUs)]
ES7000 Scaling Up: All 32 CPUs

[Diagram: eight sub-pods (1-8), each with four CPUs behind a third-level cache (TLC), connected through crossbar intra-connects to I/O modules and MSUs]
ES7000 Partitioning

[Diagram: the same 32-CPU system divided into four partitions (1-4), each owning its own sub-pods, I/O modules, and MSUs]
ES7000 Common Usage Model: Unisys ES7000 (32-bit)

[Diagram: a single-domain 16-way configuration and a dual-domain 32-way configuration; each cell groups four Intel processors behind a shared fourth-level cache, linked by a high-performance crossbar to centralized memory and 12 x 66 MHz PCI slots per cell]
Unisys ES7000 (32-bit)

- Highest performer of the Xeon MP-based ES7000s
- Best availability and manageability features with Server Sentinel
- For large databases and mission-critical applications
- Can be configured from 8 to 32 Xeon MP processors with up to 8 partitions
- More than twice the I/O bandwidth of the Orion 200 model: up to 64 x 66 MHz and 32 x 33 MHz PCI slots
Unisys ES7000 (32-bit)

- Up to 8 cellular sub-pods: 4-way Intel Xeon MP processors (1.4 GHz or 1.6 GHz, 64-bit bus at 400 MHz FSB) with a 32MB module of shared cache
- Up to 4 crossbars: high-performance, non-blocking
- Up to 64GB of centralized memory, containing the cache coherency mechanism
Unisys ES7000 (64-bit)
Enterprise Wintel Architecture

Windows Datacenter Program
- High-end systems: 32 processors, 64GB memory
- Higher level of support: joint support queue

Unisys Server Sentinel
- System health monitoring and self-healing
- Automated operations and problem reporting
- Hardware and software inventory
- Remote management

Unisys Application Sentinel (SQL Server)
- Self-optimization for SQL Server environments: analyzes metrics, compares them to a known good baseline, and chooses setting/resource changes to relieve bottlenecks
- Self-healing for SQL Server environments: predicts halt/hang conditions and triggers graceful failover
- Configurable automation level
Pilgrim's Pride

A $2.2B poultry producer located in Pittsburg, Texas, the 3rd largest in the US. Its 2TB database is the largest SAP deployment in the world running on SQL Server.

Business problem:
Required additional reliable, scalable capacity for a database server that was running out of headroom on a commodity 8-way server. The current number of users is 2,600 and is expected to double by 2004 through acquisition and regular business expansion.

Solution:
Two 16-way ES7000s running Windows 2000 Datacenter Server and SQL Server 2000, each with 32GB of memory. Each system is configured with a 12-way DB server and a 4-way application server clustered to fail over to the other.
Enterprise Class Systems Infrastructure

- Enterprise Wintel Architectures
- Enterprise Storage Architectures
- Microsoft Systems Architecture
High-End Storage Arrays: Architecture Design Alternatives

Yesterday: Bus (1.6 GB/s)
- Advantages: good performance; simple, low cost
- Challenges: bus contention; limited scaling

Yesterday: Switch (10.6 + 5.3 = 15.9 GB/s)
- Advantages: good performance
- Challenges: switch contention; limited scaling; complex, costly

Today: Direct Matrix (64 + 6.4 = 70.4 GB/s)
- Breakthrough design: simple, cost-effective; more reliable; more scalable; future-proof growth
Symmetrix Direct Matrix Architecture: Interconnect and Cache

Matrix interconnect
- Pathing: 128 direct point-to-point dedicated connections
- Matrix backplane

Dynamic global cache
- Parallelism: 32 independent regions, 32 concurrent I/Os

More performance: more IOPs and bandwidth; more burst performance; more business-continuity performance
More availability: simpler design; eliminates bus limitations; eliminates switch limitations
More functionality: full software compatibility; assured network interoperability
Better economics: scale high / scale low; more efficient use of hardware; precisely targeted configurations
Symmetrix Direct Matrix: Architectural Specifications

Symmetrix DMX Direct Matrix interconnect
- Up to 128 direct paths from directors and cache
- 500 MB/s per direct path
- Up to 64 GB/s maximum rated data bandwidth
- Up to 6.4 GB/s message bandwidth

Dynamic global cache
- Up to 32 concurrent I/Os through cache
- 500 MB/s per cache region
- Up to 16 GB/s maximum cache bandwidth
- Architected for up to 512GB of global memory

500 MHz PowerPC powerplant
- Up to 116 processors
- Up to 106,000 Dhrystone MIPS

2Gb Fibre Channel drive infrastructure
- Up to 64 drive loops
- Up to 12.8 GB/s back-end I/O bandwidth
- Architected for up to 2,048 disk drives

Up to 12 high-density connectivity directors
- 8-port, 2Gb Fibre Channel directors
- 16 to 96 direct connections
- Over 8,000 network-connected hosts
Synchronous Mirroring

- The most durable form of remote mirroring
- Confirmation is returned as part of I/O completion
- Synchronous vs. asynchronous, and adaptive mode

EMC SRDF
- Architecture uses delta technology (track tables)
- Fully independent of host operating systems, DBMS, and filesystems
- Data is mirrored remotely
- Bi-directional source-to-target(s) architecture
- Mirrored volumes are logically synchronized
- Added protection against drive, link, and server failures

[Diagram: source array connected to target array over SRDF links]
SRDF Modes of Operation: Synchronous Mode

[Diagram: source and target arrays over SRDF links]
1. The I/O write is received from the host/server into the cache of the source.
2. The I/O is transmitted to the cache of the target.
3. A receipt acknowledgment is provided by the target back to the cache of the source.
4. Ending status is presented to the host/server.
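The four synchronous-mode steps above can be sketched as a tiny model: the host's write does not complete until the target array has acknowledged the copy. Everything here is an illustrative stand-in, not an EMC API.

```python
def srdf_sync_write(source_cache: list, target_cache: list,
                    block: bytes) -> str:
    """Model one synchronous SRDF write; returns the ending status."""
    source_cache.append(block)       # 1. write lands in the source cache
    target_cache.append(block)       # 2. I/O transmitted to the target cache
    ack = block in target_cache      # 3. target acknowledges receipt
    return "complete" if ack else "failed"  # 4. ending status to the host

src, tgt = [], []
print(srdf_sync_write(src, tgt, b"page-0"))  # complete
```

Because step 4 waits on step 3, link latency is added to every write, which is the durability/latency trade-off that asynchronous and adaptive modes relax.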
SRDF and Clustering

GeoSpan for MSCS provides cluster support across a geography.

[Diagram: primary location and "failover" location linked by SRDF, a private interconnect, and a heartbeat connector]
Old Mutual Life Assurance Company

Old Mutual, a Fortune Global 500 company, is a world-class international financial services company that manages more than $234 billion in funds for millions of customers worldwide. Old Mutual has been aggressively expanding its operations in life assurance, asset management, banking, and short-term general insurance.

Business challenge:
Promote business continuity by streamlining an overgrown server environment and providing total disaster recovery and backup to prevent downtime and service interruptions.

Solution:
Unisys ES7000 servers, Microsoft Windows 2000 Advanced Server and SQL Server 2000, and EMC Symmetrix, SRDF, GeoSpan, and PowerPath products. Outage time went from 54 hours per month to near zero.
Using TimeFinder for Backups

1. Quiesce SQL Server.
2. Ensure all updates are completed to the production disk volume.
3. Split off a point-in-time BCV mirror of the logical volume.
4. Resume production application processing while...
5. ...performing backups from the BCV mirror.

[Diagram: production volume with a BCV mirror for backup feeding a tape system]
Storage Management: Functionality Made Simple, Automated, Open

- Replication and recovery: TimeFinder, SRDF, Consistency Groups
- Performance management: ControlCenter Symmetrix Optimizer, Workload Analyzer
- Device management: ControlCenter Symmetrix Manager, SRDF/TimeFinder Manager, Symmetrix Data Mobility Manager (SDMM)
- Intelligent supervision: ControlCenter StorageScope, Database Tuner, Automated Resource Manager, SAN Manager, Common Array Manager; EMCLink
- Information safety: EMC Data Manager (EDM), ControlCenter Replication Manager, ISV backup applications
- Infrastructure services: PowerPath, Celerra HighRoad, GeoSpan

Legend: multivendor storage management vs. EMC platform storage management
Enterprise Class Systems Infrastructure

- Enterprise Wintel Architectures
- Enterprise Storage Architectures
- Microsoft Systems Architecture
The Unisys Microsoft Systems Architecture is a best-of-breed, Windows-based infrastructure producing faster implementations with predictable costs, reduced risk, and faster time to benefit.

Fully redundant data centers supporting "Unisys-unique" capabilities:
- Disaster recovery
- Enhanced systems management
- Secure application tier
- Large OLTP databases
- Multi-terabyte business intelligence hosting
- Server consolidation

Unisys MSA EDC Program
- Site A: EMC lab
- Site B: Unisys lab
A Summary of Today's Session

SQL Server 2000
- SQL Server failover clustering
- Resource management: memory, CPUs
- Indexing
- I/O and strategies

Enterprise class systems infrastructure
- Enterprise class servers: Unisys ES7000
- Enterprise class storage: EMC Symmetrix
- Unisys services and MSA: infrastructure engineered for your most demanding requirements
Thank You!