A Dell Technical White Paper
Dell │ Database Solutions Engineering
Leena Basanthi
SQL Server 2008 R2 Data Protection Using DPM 2010 and Dell EqualLogic ™ PS Series Arrays
Executive Summary
In today’s organizations, most mission-critical systems house their data in database management systems (DBMS). One of the most popular relational databases on Microsoft® Windows® is Microsoft SQL Server™, a critical component of many business intelligence and e-business applications. If a database server crashes due to planned or unplanned events such as operator mistakes, viruses, or database corruption, a business can face downtime that results in lost revenue, productivity, and customers. Organizations should therefore design and implement an effective business continuity plan, and a rapid and reliable data protection strategy is needed to keep available the system-critical applications whose data is housed in SQL Server databases.
Microsoft System Center Data Protection Manager 2010 (DPM 2010), in conjunction with Dell™ EqualLogic™ storage systems, brings you a complete data protection solution for Microsoft SQL Server 2008 R2 that also enables scalable, easy-to-manage storage consolidation for enterprises of all sizes. DPM 2010 is disk-based data protection software that enhances reliability and provides fast backup windows.
This white paper provides best practices for using DPM 2010 for SQL Server 2008 R2 protection combined with Dell EqualLogic storage. The best practices outlined in this paper help customers ensure rapid and reliable recovery of a database and eliminate the problems associated with traditional backup windows.
THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL
ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR
IMPLIED WARRANTIES OF ANY KIND.
© 2010 Dell Inc. All rights reserved. Reproduction of this material in any manner whatsoever without
the express written permission of Dell Inc. is strictly forbidden. For more information, contact Dell.
Dell, the DELL logo, and the DELL badge, EqualLogic, PowerEdge, and PowerConnect are trademarks of
Dell Inc. Microsoft, Windows, SQL Server, and Windows Server are registered trademarks of Microsoft
Corporation in the United States and/or other countries. Intel is a registered trademark of Intel
Corporation in the U.S. and/or other countries.
Other trademarks and trade names may be used in this document to refer to either the entities
claiming the marks and names or their products. Dell disclaims proprietary interest in the marks and
names of others.
Table of Contents
Executive Summary
Overview of System Center Data Protection Manager 2010
Using DPM 2010 With PS Storage for SQL Server Database Protection
  DPM 2010 With PS Storage
  DPM 2010 With SQL Server 2008 R2
  How DPM 2010 Capabilities Blend With PS Storage Features to Provide Near-Continuous Data Protection for SQL Server 2008 R2 Database
  Reference Configuration
Conclusion
References
Introduction

Businesses of all sizes face challenges in protecting database servers that run business-critical online transaction processing (OLTP) and online analytical processing (OLAP) workloads. Microsoft SQL Server 2008 R2 is one such comprehensive data management and business intelligence platform that enables enterprises to run business-critical applications and to store relational and structured data. Because Service Level Agreements (SLAs) demand 24x7 application availability, many organizations cannot tolerate the impact of losing data stored in a database.

To improve the availability of mission-critical applications, it is necessary to develop a comprehensive data protection strategy around Recovery Time Objective (RTO) and Recovery Point Objective (RPO), the key metrics of an effective business continuity plan. The RTO is the amount of time it takes to recover from a disaster event, and the RPO is the acceptable amount of data that can be lost to a disruption. The most common data protection strategies driven by these metrics are backups made to tape and sent offsite, backups made to disk on site, replication of data, and high-availability systems. Choosing an appropriate strategy that addresses timely data recovery is one of the challenges IT administrators often face. To minimize the impact of database downtime and improve business continuance, enterprises need an optimal backup plan, such as near-continuous data protection with near-instantaneous recovery, to keep the RPO as small as possible.

Microsoft System Center Data Protection Manager 2010 (DPM 2010) is a best-of-breed product that offers effective and near-continuous data protection for Windows application servers, such as SQL Server, using Microsoft’s shadow copy technology for continuous backups. DPM 2010 offers an RPO determined by how often you take a recovery point.
When DPM 2010 is used in conjunction with Dell EqualLogic storage and Dell PowerEdge™ servers, they work together to simplify deployment, improve performance, ensure seamless scalability, and provide optimal data protection for key business applications. The Dell EqualLogic PS6010 and PS6510 are the newest series of EqualLogic iSCSI SAN storage systems; they add the power of 10GbE technology and provide higher-bandwidth performance for mission-critical, high-I/O database applications. This white paper focuses on how DPM 2010 works with PS6010 storage and SQL Server 2008 R2, and on how they are better together in providing a complete business continuity solution.
Overview of System Center Data Protection Manager 2010

System Center Data Protection Manager 2010 (hereafter referenced as DPM 2010), an integrated member of the Microsoft System Center family of products, is designed to provide near-continuous data protection with a primary focus on servers running Microsoft workloads such as Microsoft SQL Server OLTP/OLAP databases. DPM 2010 enables administrators to implement real-time disk protection for SQL Server databases and gives both administrators and end users the flexibility to restore files from backups and restore points. DPM 2010 seamlessly integrates with disk and tape, offering disk-based (D2D), tape-based (D2T), or combined disk- and tape-based (D2D2T) protection. DPM 2010 offers features such as an intelligent GUI for ease of management, strong PowerShell support, and a robust replication and checkpoint system that allows database administrators to perform backups and recoveries quickly and successfully. With the DPM 2010 Administrator Console, users get a consolidated interface that gives immediate access to the Monitoring, Protection, Recovery, Reporting, and Management task areas, as shown in Figure 1.
Figure 1: DPM Administrator Console
As part of the Microsoft Windows Server® System, DPM 2010 includes tools like the Microsoft Management Console (MMC), to which all DPM 2010 reports and alerts are uploaded. DPM 2010 also provides the Self-Service Recovery Tool (SSRT), which enables end users to recover SQL Server databases backed up by the DPM 2010 server without any intervention from the DPM 2010 administrator.
Using DPM 2010 With PS Storage for SQL Server Database Protection

DPM 2010 extends its powerful capabilities to deliver near-continuous data protection for Microsoft SQL Server 2008 R2. The subsequent sections describe how DPM 2010 blends with Dell EqualLogic storage and how it works to protect SQL Server 2008 R2. These topics are followed by a series of DPM 2010 features and best practices that together provide complete database protection.
DPM 2010 With PS Storage
For details about deploying and configuring DPM 2010 with PS Series storage systems, refer to the PS Series best practices guide Deploying DPM 2007 in an iSCSI SAN from EqualLogic, which also applies to DPM 2010. This section describes the benefits of using iSCSI SAN storage with DPM 2010, a server software application that provides near-continuous data protection and optimizes disk-based backup and recovery operations.

Intelligent PS Series storage is an excellent choice for the storage pool resources in DPM 2010 configurations. By deploying DPM 2010 with a PS Series SAN, you can configure a completely redundant environment using Multipath I/O technology, which improves the reliability of the DPM 2010 server. In addition, with a PS Series SAN, you can boot the DPM 2010 server and the protected application servers from the SAN, resulting in a configuration that supports nearly immediate system recovery.

PS Series storage improves storage utilization efficiency and availability by allowing administrators to logically over-allocate SAN volumes with features like thin provisioning, which meets current and future storage capacity needs while eliminating the difficulty of adding storage to applications or operating systems that do not support online expansion. This feature enables the allocation of the very large volumes on the backend DPM 2010 storage pool that DPM 2010 recommends for a protection group's replicas and recovery points. Advanced storage features such as Clone, Snapshot, Replication, and Performance Monitor can be incorporated into the existing solution at no extra cost. Cloning a volume creates a new volume with the same size and contents as the original, which dramatically reduces network overhead during creation of the initial replica for a protected database.
Other features like Snapshot and Replication simplify and improve the performance of backup and recovery operations. By replicating volumes from one group to another, you can set up a simple yet robust disaster recovery plan for the primary group, allowing you to promote a replica set to a new volume and snapshots, temporarily or permanently. As a supplement to the Performance Monitor feature of the PS Series array, you can also use the Multi Router Traffic Grapher (MRTG) to monitor the I/O activity, latency, and throughput of volumes and members. For consolidated performance and event monitoring across the virtualized Dell EqualLogic SAN, the EqualLogic SAN HeadQuarters (SAN HQ) management tool is the best option. SAN HQ gives administrators the ability to monitor multiple PS Series groups, get historical performance reports, and do trend analysis for capacity planning and troubleshooting from a central interface.

With all the preceding features included in the Dell EqualLogic PS6010, shown in Figure 2, it serves as an ideal backend storage pool for DPM 2010: it optimizes application responsiveness through high bandwidth and intelligent tiering of data within the array, across low-latency solid state disks (SSDs) and performance-sensitive 15K RPM SAS hard disks. The PS6010 supports high throughput with 10GbE bandwidth per port.
Figure 2. PS 6010
DPM 2010 With SQL Server 2008 R2

DPM 2010 starts its near-continuous data protection process through a protection group (a collection of data sources) and a storage pool with one or more disks dedicated to storing the replicas and recovery points. To fully protect a data source with DPM 2010, the following protection policies must be defined:

- What type of data source is being protected?
- Which protection method is to be used?
- How much disk space is to be allocated in the storage pool?
- What is the method of replica creation for the members of the protection group?

The DPM 2010 protection process involves creating and maintaining a full replica of the protected data and regularly creating recovery points of the synchronized replica. The replica serves as a full backup of the protected data. The DPM 2010 recovery process, in turn, involves choosing a previous version of the data source from the recovery points stored on the DPM 2010 server and then restoring a copy of the selected data to its point of origin on the protected server.

How DPM 2010 works to protect SQL Server: During the configuration of the protection group, the user selects the SQL Server data source to be protected at either the database level or the instance level. Each member of a protection group is associated with a replica, an initial baseline copy of the protected database that serves as its full backup. After a replica is created, DPM 2010 captures express full backups using the SQL Server VSS Writer, which identifies the blocks that have changed in the protected database; only those blocks are sent to the DPM 2010 server for protection. To keep the replicas synchronized with the data on the protected computers, DPM 2010 incorporates a transactional replication feature that uses application-aware agents to protect SQL Server data through its transaction logs. DPM 2010 therefore uses a combination of transaction log replication and block-level synchronization, in conjunction with the SQL Server VSS Writer, to help ensure rapid recovery of SQL Server databases. DPM 2010 stores its replicas, shadow copies, and logs in the storage pool, which consists of one or more disks.
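The four protection policies listed above can be captured in a small configuration sketch. The class and field names below are hypothetical illustrations, not DPM object or API names:

```python
from dataclasses import dataclass

@dataclass
class ProtectionGroupPolicy:
    """Hypothetical model of the four policies a DPM 2010 protection
    group requires; purely illustrative, not a DPM data structure."""
    data_source: str         # database-level or instance-level source
    method: str              # "D2D", "D2T", or "D2D2T"
    disk_allocation_gb: int  # space reserved in the storage pool
    replica_creation: str    # "automatic" (over the network) or "manual"

policy = ProtectionGroupPolicy(
    data_source="SQLInstance\\SalesDB",
    method="D2D",
    disk_allocation_gb=300,
    replica_creation="automatic",
)
print(policy.method)  # D2D
```

Every member added to the group inherits these answers, which is why DPM asks them once per protection group rather than per database.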
How DPM 2010 Capabilities Blend With Dell Hardware (PS Storage 6010 and Dell 11g Server R910) to Provide Near-Continuous Data Protection for the SQL Server 2008 R2 Database
Each of the following sections explains how Dell hardware is suited to protecting SQL Server 2008 R2 using DPM 2010. The sections describe DPM 2010's data protection and recovery capabilities, such as data source and protection method selection and protection group configuration (including recovery goals, disk allocation, replica creation method, and consistency checks), and how these capabilities blend with EqualLogic features.

Dell PowerEdge 11g Server R910: Because DPM 2010 and SQL Server 2008 R2 support scale-up scenarios on the largest available x64 hardware, deploying them on the Dell 11g PE R910 server is an ideal choice. With Intel® Advanced RAS Technology features, the PE R910 can automatically monitor, report, and recover from hardware errors to maintain data integrity and keep mission-critical services such as OLTP/OLAP and backup/recovery applications online. Features such as performance resources, power efficiency, and I/O and memory scalability make it possible to maximize database workloads in a data center and to use DPM 2010 servers more efficiently.
Enhancements in DPM 2010: According to Microsoft, DPM 2010 offers protection for up to 2,000 databases per DPM 2010 server. DPM 2010 offers instance-level protection with data source co-location, so that all new databases within the instance are automatically protected. With the self-service recovery capability, database administrators can now recover their databases without intervention from the backup administrator and without using the DPM 2010 Administrator Console. In addition, DPM 2010 automatically grows and shrinks volumes as required, which helps use storage more efficiently.
DPM 2010 Protection Methods: DPM provides tape-based and disk-based data protection, which can be configured at the protection-group level. DPM offers the following protection methods:
- Disk-to-disk (D2D). This method is for short-term protection, in which data from the protected computer is stored on disk-based storage such as the DPM 2010 storage pool or custom volumes. With this method, customers get faster backup and recovery times.
- Disk-to-tape (D2T). This is the traditional method of backing up data. In DPM, the data is backed up from the replica in the storage pool to tape, so that there is no network traffic and no impact on the protected computer. You can also back up data directly from the protected computer to tape. Enabling co-location of data on tape from multiple protection groups with the same retention range maximizes tape utilization. This method also adds a measure of security, because the tapes are stored offsite.
- Disk-to-disk-to-tape (D2D2T). This method combines D2D and D2T for extra protection. Using DPM, data can be backed up to disk and tape through a combination of full and incremental backups from either the protected computer or the DPM replica. The D2D2T method provides the quick recovery benefits of disk-based storage in the short term and tape-based storage in the long term. With DPM 2010 you can create long-term, tape-based archive storage for critical data from the disk-based replica, with no impact on the protected computer. The archives created from the replica combine full and incremental backups.
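The choice among these three methods is largely driven by retention goals. The sketch below encodes that rule of thumb; it is my own heuristic for illustration, not an official DPM selection algorithm:

```python
def suggest_method(short_term_days: int, long_term_years: int) -> str:
    """Map retention goals to a DPM protection method.
    Rule of thumb only, not an official DPM algorithm."""
    if long_term_years > 0 and short_term_days > 0:
        return "D2D2T"  # fast disk recovery short term, tape archive long term
    if long_term_years > 0:
        return "D2T"    # archive-only requirements
    return "D2D"        # short-term disk protection only

print(suggest_method(14, 7))  # D2D2T
```

For example, a two-week operational recovery window combined with a seven-year compliance archive points at D2D2T, which matches the paper's description of disk for the short term and tape for the long term.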
Disk Allocation: To configure a DPM 2010 backend storage pool, a RAID 5 configuration offers an effective compromise between capacity, cost, reliability, and performance. However, because the DPM 2010 server workload is composed primarily of write operations, RAID 10 is best suited in terms of reliability, performance, and scalability. Microsoft strongly recommends using GPT disks for the DPM 2010 storage pool, which should be twice the size of the protected data for added protection. In DPM 2010, the data co-location feature enables protection of multiple data sources on a single volume or on the same tape, which allows you to store more data on each volume or tape. Co-location strongly improves scalability when protecting many small data sources and enables better DPM 2010 server utilization. To enable parallel replication and maximize performance in DPM 2010, users can choose custom volumes, which are not part of the DPM 2010 storage pool. However, custom volumes have the following limitations:

- Custom volumes CANNOT be co-located.
- Custom volumes DO NOT support SQL Server instance-level backup.
With advanced EqualLogic thin provisioning, SQL Server database volumes or the DPM 2010 storage pool can be preset to a specific size at volume creation. Physical storage is then automatically allocated from the free storage pool as the database, replica, or recovery point volumes grow. The benefits are simpler volume management and more efficient use of physical storage resources.
Replica Volume and Recovery Point Volume: Each data source of a protection group is associated with a replica volume and a recovery point volume. The replica volume contains the replica for the protected data source, and the recovery point volume contains the shadow copies and recovery points for that data source. These volumes can reside either in the DPM 2010 storage pool or on custom volumes. The size of these volumes depends on the size of the protected data, the daily recovery point size, the expected data growth rate, and the retention range objectives. When the disk space allocated for replica volumes and recovery point volumes exceeds the threshold limit, DPM 2010 generates alerts for each of the volumes; to resolve this, users need to add disks to the storage pool. With the EqualLogic family of iSCSI SANs, users get rapid SAN deployment, easy storage management, and seamless expansion. All EqualLogic PS Series arrays are completely interoperable, and they can be mixed and matched to build tiered storage within a single SAN or across SAN groups.
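As a rough sizing aid, the dependencies listed above can be turned into arithmetic. The formulas below follow the general shape of DPM's documented default allocations for SQL Server data sources, but the exact coefficients, and especially the 2% daily-churn rate, are assumptions for illustration; always validate against the sizes DPM itself recommends for your protection group:

```python
def replica_volume_gb(data_gb: float) -> float:
    # Assumed default sizing: 1.5x the protected data source size
    return data_gb * 3 / 2

def recovery_point_volume_gb(data_gb: float, retention_days: int,
                             daily_churn: float = 0.02) -> float:
    # data size x retention range x assumed daily change rate,
    # plus a small fixed overhead (~1.6 GB, an assumption)
    return data_gb * retention_days * daily_churn + 1.6

# For the 117 GB database used later in this paper's tests, 14-day retention:
print(replica_volume_gb(117))             # 175.5
print(recovery_point_volume_gb(117, 14))  # ~34.4
```

Running the numbers this way shows why thin-provisioned EqualLogic volumes pair well with DPM: the recovery point volume grows with retention range and churn, which are hard to predict up front.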
Replica Creation Method: This is the process by which the data sources selected for protection are transferred to the DPM 2010 server storage pool. DPM 2010 can create replicas either automatically over the network or manually. Automatic replica creation is easier, and it is a good option for instance-level SQL Server protection, for which the disk allocation is the DPM 2010 storage pool. Such large replica creation jobs should be scheduled to run during off-peak hours, when network traffic is low. The manual replica creation option can be chosen depending on the size of the protected data and the speed of the network. If a large, critical database is to be protected, then the combination of the EqualLogic Clone feature, a custom volume, and a manual replica is the best option. This process involves creating a clone of the large volume, assigning it to the DPM 2010 server as a replica volume of storage type custom volume, and assigning another volume as the recovery point volume. After selecting the manual replica option, it is important to copy the cloned content to the DPM 2010 replica folder path, for example: C:\Program Files\Microsoft DPM\DPM\Volumes\Replica\SqlServerWriter\vol_1f447024-985a-4747-a153-b4e06c0aa665\7c59f9ad-3846-44e2-85e7-f6b311bcac10\Full\G-Vol\. Failing to do so results in automatic replica creation, which takes much longer. It is recommended to create the clone when the I/O load on the application is minimal. For more details, refer to the PS Series best practices guide Deploying DPM 2007 in an iSCSI SAN from EqualLogic, which also applies to DPM 2010.
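To see why a clone-based manual replica pays off for large databases, it helps to estimate the wire-transfer time an automatic replica would need. The sketch below is back-of-the-envelope arithmetic only; the 70% link-efficiency factor covering protocol overhead is an assumption, not a measured value:

```python
def transfer_hours(data_gb: float, link_gbps: float,
                   efficiency: float = 0.7) -> float:
    """Rough wall-clock time to copy data_gb over a link of link_gbps.
    The efficiency factor for protocol overhead is an assumption."""
    seconds = data_gb * 8 / (link_gbps * efficiency)
    return seconds / 3600

# Initial replica of this paper's 117 GB test database:
print(round(transfer_hours(117, 1), 2))   # 0.37 -> roughly 22 minutes on 1GbE
print(round(transfer_hours(117, 10), 2))  # 0.04 -> a few minutes on 10GbE
```

Even these optimistic numbers assume the link is idle; on a shared production LAN, or for multi-terabyte databases, an in-array EqualLogic clone avoids the LAN transfer entirely.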
Synchronization of Application Data: To provide continuous data protection after the creation of the initial replica, changes to the protected database are transferred to the DPM 2010 server incrementally through synchronization at regular intervals. How the replica is synchronized depends on the type of application and the type of synchronization. DPM 2010 provides the following methods of synchronizing a replica:

- Incremental synchronization
- Synchronization with consistency check

To choose the most suitable synchronization method for database protection, the user should be aware of the following guidelines:

- DPM 2010 does not support incremental synchronization for the master and msdb databases, for log-shipped or read-only databases, or for databases that use the simple recovery model. These use express full backups, and a recovery point is created for each express full backup.
- For all other SQL Server databases, synchronization transfers a transaction log backup, and recovery points are created for each incremental synchronization and express full backup. The synchronization frequency should be selected based on how much data loss you can tolerate.
- Running a consistency check is another synchronization process; it checks for inconsistencies between the replica and the protected data source. The time taken to complete this process depends on the extent of the discrepancies.
- For heavy-workload applications such as SQL Server 2008 R2 databases, it is recommended to perform consistency checks during off-peak hours, because they affect the performance of the protected computer, the DPM 2010 server, and the network. The process is relatively efficient, however, because only the changes and checksums are transferred.
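The trade-off between synchronization frequency and tolerable data loss can be made concrete with a simplified sketch. It treats the worst-case loss window as equal to the sync interval and ignores express-full scheduling and DPM's recovery point limits:

```python
def sync_plan(sync_minutes: int, retention_days: int) -> dict:
    """Worst-case data-loss window equals the synchronization interval;
    recovery points accumulate roughly one per sync across the retention
    range. Simplified model: ignores express fulls and DPM point limits."""
    points = retention_days * 24 * 60 // sync_minutes
    return {"max_loss_minutes": sync_minutes, "recovery_points": points}

# Hourly transaction-log synchronization with a 7-day retention range:
print(sync_plan(60, 7))  # {'max_loss_minutes': 60, 'recovery_points': 168}
```

Halving the interval halves the worst-case loss but doubles the recovery points kept for the same retention range, which in turn grows the recovery point volume described earlier.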
Network Throttling: Network traffic can be throttled by setting limits on maximum network utilization to minimize the impact of the DPM 2010 server's primary operations on the network. Throttling helps ensure that network bandwidth remains available to other applications, and it limits the amount of bandwidth the DPM 2010 server can consume during operations such as replica creation, synchronization, and consistency checks. However, users should bear in mind that throttling can lengthen the time these operations take to complete. The bandwidth usage limit can be set as a fixed amount of data to be transferred during defined peak and non-peak hours. Note that the network throttling feature cannot be used unless QoS is enabled on the network interfaces of both the production server and the DPM 2010 server.
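The lengthening effect of a throttle cap is easy to quantify. The sketch below is illustrative arithmetic only and ignores protocol overhead:

```python
def throttled_job_hours(data_gb: float, limit_mbps: float) -> float:
    """Duration of a DPM transfer when the agent is capped at limit_mbps
    (megabits per second). Pure arithmetic; ignores protocol overhead."""
    megabits = data_gb * 8 * 1024
    return megabits / limit_mbps / 3600

# A 50 GB consistency-check transfer capped at 100 Mbps:
print(round(throttled_job_hours(50, 100), 2))  # 1.14 hours
```

Estimates like this help verify that a throttled job still fits inside the off-peak window before the cap is applied.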
SQL Server Database Recovery: DPM 2010 offers rapid and reliable recovery of SQL Server databases by integrating a point-in-time data version with the existing application log, without the need for constant replication and synchronization.

- DPM 2010 allows recovery of SQL Server databases at the instance level or to a network folder, and it also allows you to keep both the existing database and the recovered database (renamed) in the original instance of SQL Server, with the option of leaving the recovered database in either an operational or a non-operational state.
- DPM 2010 offers an option to copy the SQL transaction logs between the selected version of the database and the latest version available for recovery. The recovery operation runs smoothly because any synchronization jobs for the selected recovery item are cancelled while the recovery is in progress.
- With enhanced support in DPM 2010, the Self-Service Recovery Tool (SSRT) enables end users to recover SQL Server databases that are backed up by the DPM 2010 server, without any intervention from the DPM 2010 administrator.

For quicker recovery, users can opt for SAN-based recovery, which requires hardware snapshots. SAN-based recovery is advantageous for recovering very large SQL Server databases because it allows faster and more reliable recovery with reduced network bandwidth, bypassing the LAN. This feature is valid only if hardware snapshot functionality is enabled. In a Dell EqualLogic SAN solution, users can use EqualLogic ASM to perform DPM 2010's SAN-based recovery, which involves taking EqualLogic snapshots of the replica volume and the recovery point volume on the DPM 2010 server. For a detailed SAN-based recovery procedure, refer to the PS Series best practices guide Deploying DPM 2007 in an iSCSI SAN, which also applies to DPM 2010.
SAN HQ: SAN HeadQuarters (SAN HQ) provides centralized monitoring, historical performance trending, and event reporting for multiple PS Series groups, without disrupting access to group storage or degrading performance. Using SAN HeadQuarters, you can:

- Become quickly informed of hardware or performance-related problems.
- Improve performance by identifying performance bottlenecks.
- Allocate group resources more effectively by identifying underutilized resources.
- Enable multiple individuals to access and monitor the same performance data.
The following table gives a brief description of the SAN HQ features:

Feature                    Dell EqualLogic SAN HeadQuarters
Monitoring and Analytics   Capacity, I/O performance, network data, member hardware and configuration, volumes, replication, maximum theoretical IOPS, IOPS vs. latency, disk queue depth
Reporting                  Performance, capacity utilization and trending, group configuration with alerts, replication status, Top 10, host connections
Ease of Use                Single sign-on for launching EqualLogic Group Manager, point-in-time data views, user-defined data ranges, summary views, contextual pop-up views
Reference Configuration
To establish DPM 2010 performance that matches administrators' expectations and addresses their concerns, it is necessary to understand the impact of the DPM 2010 server's primary data operations on the production database and on the DPM 2010 server, as well as how these operations consume network resources and affect network performance.

Using the guidelines described in this white paper, the Dell Database Solutions Engineering team constructed the reference configuration shown in Figure 3. The results and analysis show how the DPM 2010 server's primary data transfer operations, such as initial replica creation, synchronization, and consistency checks, impact the production system, the backup system, and network resources. The tests were also designed to determine how the performance of replica creation is limited by the speed of the network connection between the DPM 2010 server and the protected computers.
Figure 3. Reference Configuration
Test Case: The following test case scenarios are designed to measure CPU utilization, memory availability, and network performance for replica creation operations with each disk allocation method. In the case of manual replication, the disk allocation is a custom volume that utilizes the EqualLogic Clone feature. For DPM 2010 server replication (automatic replication), the disk allocation is the DPM 2010 storage pool. The tests were carried out on 1GbE and 10GbE LAN networks with a 10GbE iSCSI network.
Notes:
- SQL_Server is the server that hosts the SQL Server 2008 R2 database.
- DPM_Server is the DPM 2010 server that initiates the primary data operations.
- Manual Replica is the replication that utilizes the EqualLogic cloned volume as a custom volume.
- DPM Replica is the automatic replication in which the DPM 2010 server performs the replication using the DPM 2010 storage pool.
CPU Utilization:
Figure 4 (A): Results for a 1GbE LAN network with 10GbE storage bandwidth.
Figure 4 (B): Results for a 10GbE LAN network with 10GbE storage bandwidth.
[Figure 4 (A)/(B): charts plotting % Processor Time over the sampling interval for four counters: SQL_Server_Manual_Replica(_Total)\% Processor Time, SQL_Server_DPM_Replica(_Total)\% Processor Time, DPM_Server_Manual_Replica(_Total)\% Processor Time, and DPM_Server_DPM_Replica(_Total)\% Processor Time.]
Test configuration:

RAID   LAN Network   iSCSI Network   Replica           Disk Allocation         Protection at Database/Instance
10     1GbE/10GbE    10GbE           DPM 2010 server   DPM 2010 storage pool   Database (instance applicable)
10     1GbE/10GbE    10GbE           Manual            Custom volume           Database (instance N/A)
Analysis: The % Processor Time counter was captured to measure the percentage of time the processor was busy during the initial replication of a 117 GB SQL Server database. It was captured for both replication methods, manual replication and DPM 2010 replication, over 1GbE and 10GbE LAN networks, as shown in Figure 4 (A) and (B). In both cases, CPU utilization on the SQL server averaged a comparatively high 20%, versus an average of 3% on the DPM 2010 server. The impact of the initial replication is therefore greater on the SQL server than on the DPM 2010 server, so the initial replication of the database should be performed when the I/O load on the production database server is low.
The value shown in the graph for the DPM 2010 server is minimal because it reflects a 117 GB data load with no I/O running on the SQL server. This value will change significantly on the DPM 2010 server in the following situations:

- When parallel replication is enabled using more custom volumes.
- When more databases are co-located on the disk storage pool.
- When multiple DPM 2010 jobs, such as synchronization with consistency checks, run simultaneously.

According to Microsoft, if the DPM 2010 server shows more than 95% CPU usage for more than 10 minutes, it is preferable to offload the workload to another DPM 2010 server or to stagger the synchronization schedule.
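That 95%-for-10-minutes rule can be checked mechanically against collected performance-counter samples. A minimal sketch, assuming one sample per minute (the function and threshold defaults are illustrative, not part of any DPM tooling):

```python
def sustained_overload(samples, threshold=95.0, window=10):
    """Return True if CPU samples (assumed one per minute) stay above
    `threshold` for `window` consecutive minutes -- the condition under
    which offloading work to another DPM server is suggested."""
    run = 0
    for s in samples:
        run = run + 1 if s > threshold else 0
        if run >= window:
            return True
    return False

print(sustained_overload([96] * 12))      # True  (12 straight hot minutes)
print(sustained_overload([96, 50] * 10))  # False (load keeps dipping)
```

Feeding exported perfmon data through a check like this distinguishes genuine sustained saturation from brief spikes during overlapping jobs.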
Figure 4 (C). CPU Utilization During Synchronization When I/O Is Running on SQL Server
[Chart: % Processor Time for DPMSERVER and SQLSERVER under the Manual Replica and DPM Replica methods.]
[Chart: Processor(_Total)\% Processor Time for SQLSERVER and DPMSERVER during synchronization with I/O running.]
Analysis: After the initial replica creation, the Quest Benchmark Factory tool was used to run I/O against the SQL server with a user load of 500. A synchronization with consistency check was then performed from the DPM 2010 server, after which the I/O was stopped on the client. As shown in the preceding Figure 4 (C), the peak in the graph, which rose above 25% CPU utilization, occurred while the I/O load was running on the SQL server and the DPM 2010 server was simultaneously synchronizing to check for inconsistencies. The results clearly indicate that synchronization jobs are CPU intensive.
A few guidelines for synchronization of the SQL application:
- Choose the synchronization method based on how much data you can afford to lose. If you synchronize only once a day, the maximum loss window is 24 hours; if you synchronize every hour, the maximum loss window is 1 hour.
- Not all databases support incremental synchronization. To enable faster recovery, DPM 2010 will regularly perform an express full backup in place of incremental synchronization.
- Because consistency checks affect the performance of both the DPM 2010 server and the protected computer, schedule them for hours when reduced responsiveness of the protected computer has the least impact on your business operations and network traffic is lightest.
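The relationship between synchronization frequency and the maximum loss window in the first guideline is simple arithmetic:

```python
# Sketch: the maximum data-loss window is the gap between two
# consecutive synchronizations.

def max_loss_window_hours(syncs_per_day: int) -> float:
    return 24.0 / syncs_per_day

print(max_loss_window_hours(1))   # once a day  -> 24.0
print(max_loss_window_hours(24))  # every hour  -> 1.0
```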
Memory Utilization:
Figure 5 (A). The following results are for a 1g LAN network with a storage bandwidth of 10g.
Figure 5 (B). The following results are for a 10g LAN network with a storage bandwidth of 10g.
[Charts: Memory\Available MBytes for the SQL server and the DPM 2010 server under the Manual Replica and DPM Replica methods.]
Analysis: The Memory\Available MBytes counter is captured to measure the memory available to processes running on the DPM 2010 server and to determine whether the SQL server is impacted. The Available MBytes value is the sum of memory assigned to the standby (cached), free, and zero-paged lists. From Figures 5 (A) and (B), it is observed that the value is >55 megabytes on average. DPM 2010 primary data operation jobs, such as replica creation, affect both memory and CPU usage. For this test, 8GB of RAM was used on both the DPM 2010 server and the SQL server. In the preceding results, Figure 5 (A) shows a 1g LAN network and Figure 5 (B) shows a 10g LAN network. With either replication method (Manual Replica or DPM 2010 Server Replica), memory utilization is higher on the DPM 2010 server than on the SQL server. The processes on the DPM 2010 server that can impact performance are the DPM 2010 protection agent, the DPM 2010 service, the DPM 2010 Administrator Console, and the DPM 2010 instance of the SQL Server service. If the DPM 2010 Administrator Console (MMC.exe) is left open, it can be a significant factor in high memory usage, so close the console when it is not in use. The DPM 2010 instance of the SQL Server service also uses a large amount of memory for caching, so its usage is expected to be relatively high. This does not indicate a problem, because the memory is released when available memory runs low.
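As a minimal illustration of the counter definition above (Available MBytes is the sum of the standby, free, and zero-paged lists), with hypothetical values for an 8GB server:

```python
# Sketch: "Memory\Available MBytes" is defined as the sum of the standby
# (cached), free, and zero-paged memory lists, as noted above. The
# values used here are hypothetical, not measurements from this test.

def available_mbytes(standby_mb: float, free_mb: float, zeroed_mb: float) -> float:
    return standby_mb + free_mb + zeroed_mb

# e.g. a hypothetical breakdown on an 8GB server:
print(available_mbytes(standby_mb=3000, free_mb=2000, zeroed_mb=500))  # -> 5500
```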
Notes:
- As per Microsoft, DPM 2010 requires a pagefile that is 0.2 percent of the combined size of all recovery point volumes, in addition to the recommended size (generally, 1.5 times the amount of RAM on the computer).
- Because of the Volume Shadow Copy Service (VSS) non-paged pool limitation on 32-bit operating systems, running the DPM 2010 server on a 64-bit operating system (Windows Server 2008 R2) is the ideal choice and enables you to protect more than 10TB of data.
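The pagefile sizing rule in the first note can be turned into a quick calculation. This sketches the stated formula only; the 8GB RAM and 2TB recovery-point inputs are hypothetical examples, not measurements from this paper.

```python
# Sketch of the DPM 2010 pagefile sizing rule quoted above: 1.5 x RAM,
# plus 0.2 percent of the combined size of all recovery point volumes.
# The inputs in the example are hypothetical.

def dpm_pagefile_gb(ram_gb: float, recovery_point_volumes_gb: float) -> float:
    return 1.5 * ram_gb + 0.002 * recovery_point_volumes_gb

# e.g. 8GB of RAM and 2TB (2048GB) of recovery point volumes:
print(round(dpm_pagefile_gb(ram_gb=8, recovery_point_volumes_gb=2048), 2))  # -> 16.1
```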
Network Performance
Figure 6 (A). With a 1g LAN Network for DPM (Automatic) Replica
[Chart: Memory\Available MBytes for DPMSERVER and SQLSERVER under the DPM Replica and Manual Replica methods.]
Figure 6 (B). With a 1g LAN Network for Manual Replica
[Charts: Network Interface Bytes Total/sec for SQLSERVER and DPMSERVER, for the DPM Replica and Manual Replica methods.]
Figure 6 (C). With a 10g LAN Network for Manual Replica
Figure 6 (D). With a 10g LAN Network for DPM (Automatic) Replica
Analysis: The Network Interface Bytes Total/sec counter is captured to measure throughput for the network interface; it combines inbound and outbound traffic (Bytes Sent/sec and Bytes Received/sec) in bytes. As shown in the preceding Figures 6 (A) and (B) at 1g network bandwidth, and Figures 6 (C) and (D) at 10g network bandwidth, replica creation has a large impact on data transfer for both the Manual Replica and the DPM 2010 (Automatic) Replica. In our test validation, the 117GB database was taken as the data source and replicated both manually and automatically. From the preceding results, data transfer is faster with the Manual Replica than with the Automatic Replica. However, the Manual Replica can be chosen only at the database level, not at the instance level, and to manually replicate more than one database, each database volume must be allocated its own Custom Volume and cannot be co-allocated; with the Automatic (DPM 2010) Replica, this is not the case. From the preceding Figure 6 (B) with 1g bandwidth, the impact of network traffic on the SQL server is minimal compared to the DPM 2010 server. Therefore, for large data sets and slow networks (<=1g), the Manual Replica option is the better choice, while the DPM 2010 (Automatic) Replica is a good option when collocated data sources, such as multiple database instances, must be protected, or when many small databases need to be protected.
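To put these network bandwidths in perspective, a back-of-the-envelope lower bound on moving the 117GB test database over a 1g versus a 10g link can be computed as follows. This ignores protocol overhead, disk throughput, and DPM processing, so actual replication takes longer:

```python
# Sketch: rough lower bound on initial replication time for the 117GB
# database used in these tests, at 1g and 10g link speeds. Ignores
# protocol overhead, disk limits, and DPM processing.

def transfer_minutes(data_gb: float, link_gbps: float) -> float:
    gigabits = data_gb * 8              # data volume in gigabits
    return gigabits / link_gbps / 60.0  # seconds on the wire -> minutes

for link in (1, 10):
    print(f"{link}g link: at least {transfer_minutes(117, link):.2f} minutes")
```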
If more applications share the EqualLogic iSCSI SAN, network bandwidth usage throttling enables you to limit the amount of network bandwidth that DPM 2010 can use to create and synchronize replicas. However, throttling can lengthen the time each synchronization job takes to complete.
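The effect of throttling can be illustrated with a simple token-bucket rate limiter. DPM 2010 implements bandwidth throttling internally; the class below is only a standalone sketch of the mechanism, showing why a throttled transfer takes longer to complete:

```python
import time

# Sketch: cap the bytes sent per second with a token bucket, the idea
# behind network bandwidth usage throttling. This is an illustration
# only; DPM 2010's own throttling is configured in its console.

class TokenBucket:
    def __init__(self, rate_bytes_per_s: float):
        self.rate = rate_bytes_per_s
        self.tokens = rate_bytes_per_s  # allow a one-second initial burst
        self.last = time.monotonic()

    def consume(self, nbytes: int) -> None:
        """Block until `nbytes` may be sent without exceeding the rate."""
        while True:
            now = time.monotonic()
            self.tokens = min(self.rate, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= nbytes:
                self.tokens -= nbytes
                return
            time.sleep((nbytes - self.tokens) / self.rate)

# Throttle a hypothetical 1MB transfer to 512KB/s: after the initial
# burst it takes about one extra second, versus finishing immediately
# when unthrottled.
bucket = TokenBucket(rate_bytes_per_s=512 * 1024)
start = time.monotonic()
for _ in range(16):
    bucket.consume(64 * 1024)  # send in 64KB chunks
print(f"throttled 1MB transfer took {time.monotonic() - start:.2f}s")
```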
[Charts: Bytes Total/sec on the 10g public network for the DPMSERVER and SQLSERVER custom volumes, and for the DPM Replica.]
A few guidelines to optimize performance:
- Offset the start times of synchronization jobs across different protection groups so that they do not all start at the same time.
- Run parallel backups of data sources in the same protection group to optimize network usage.
- Reduce the demand on the server's resources by scheduling fewer express full backups; however, doing so can increase data recovery time.
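The first guideline, offsetting start times across protection groups, amounts to simple schedule arithmetic. The group names, base time, and 30-minute offset below are hypothetical examples:

```python
# Sketch: stagger synchronization start times across protection groups
# so the jobs do not all begin at once. Group names, base time, and
# offset are hypothetical.
from datetime import datetime, timedelta

def staggered_schedule(groups, base="20:00", offset_minutes=30):
    start = datetime.strptime(base, "%H:%M")
    return {g: (start + timedelta(minutes=i * offset_minutes)).strftime("%H:%M")
            for i, g in enumerate(groups)}

print(staggered_schedule(["PG-SQL-Prod", "PG-SQL-Dev", "PG-FileServers"]))
# -> {'PG-SQL-Prod': '20:00', 'PG-SQL-Dev': '20:30', 'PG-FileServers': '21:00'}
```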
Conclusion
Deploying DPM 2010 and Microsoft SQL Server 2008 R2 with Dell hardware (EqualLogic PS Series 6010 arrays and the Dell PowerEdge R910 server) delivers flexibility, ease of management, and scalability. Using a PS Series SAN provides benefits that include:
- Rapid configuration of the DPM 2010 server and the production database environment
- Simple and immediate storage pool expansion with features like thin provisioning, flexibility for remote site volume replication, and DPM 2010 initial replica creation
- SAN-based recovery solutions with the EqualLogic ASM feature

DPM 2010 features seamless integration between disk and tape media and offers ease of management with its intelligent GUI. Co-allocating data sources on the DPM 2010 storage pool allows users to auto-protect multiple SQL server databases at the instance level, so that all new databases within the instance are automatically protected. Using Custom Volumes with the Manual Replica option and the EqualLogic Clone feature maximizes the DPM 2010 server's performance and enables parallel replication, but it prevents co-allocating multiple databases on the Custom Volume.

To provide continuous data protection for SQL server databases, DPM 2010 performs express full backups and creates recovery points according to the scheduled and supported synchronization methods. Inconsistencies between the protected database and the replica can be found by running a consistency check, preferably during off-peak hours because it impacts network performance. Throttling the network by setting maximum limits ensures that network bandwidth remains available to other applications. With the self-service restore capability, DPM 2010 provides fast and reliable recovery.

Based on the preceding reference configuration, the guidelines for disk allocation, replica creation, and synchronization are summarized in the following table:
Disk Allocation: Custom Volume
- A clone volume is created using the EqualLogic Clone feature as the replica volume.
- Cloning must be done when the I/O load is minimal on the production server.
- A recovery point volume must be created for each custom volume.
- Custom Volumes CANNOT be co-allocated.
- Custom Volumes DO NOT support SQL Server instance-level protection.
- Enables parallel backups of data sources, which can optimize network usage.

Initial Replica Creation Method: Manual Replication
- A replica can be created from a tape backup or other removable storage media.
- Can be done using Custom Volumes and the disk storage pool.
- Data must be manually copied to the DPM 2010 server replica folder path.
- Applicable for large databases with slow networks (<=1g).
- CPU utilization: the impact of a Manual Replica is higher on the SQL server than on the DPM server.
- Memory utilization: DPM server memory utilization is higher than on the production server.
- Network utilization: with a Manual Replica, the impact on the LAN network is lower.

Synchronization Method:
- For the Manual Replica method, the replica must be manually synchronized with a consistency check before scheduled synchronization and recovery point jobs begin.
- Because running a consistency check impacts the performance of both the DPM 2010 server and the production server (depending on network load, CPU processing power, and timing), a manual consistency check should be performed only when it is necessary to make the replica consistent with its data source (database).

Disk Allocation: Disk Storage Pool
- PS Series storage enables using thin-provisioned volumes as the backend DPM 2010 storage pool.
- Should be a dynamic disk twice the size of the protected data source (databases).
- Provides instance-level protection.
- DPM 2010 adjusts its recommendations for disk allocation to store the replicas and recovery points.

Initial Replica Creation Method: Automatic (DPM) Replication
- The DPM 2010 server copies the data from the protected server over the network. This method requires no intervention, but it can take several hours depending on network bandwidth and data size.
- A good option when collocated data sources, such as multiple database instances, are to be protected, or when many small databases need to be protected.
- CPU utilization: the impact of an Automatic Replica is higher on the SQL server than on the DPM server.
- Memory utilization: DPM server memory utilization is higher than on the production server.
- Network utilization: with an Automatic Replica, the impact on the LAN network is higher.

Synchronization Method:
- For the Automatic Replica method, the replica can be synchronized on a schedule (incremental synchronization). This is faster than performing a consistency check, but not all databases support incremental synchronization.*
- An automatic consistency check can be scheduled as a daily task, but because of its processing load on the DPM 2010 server and the production server, it should preferably be scheduled when network traffic is low.
* Note: DPM 2010 does not support incremental synchronization for master, msdb, or databases that are in read-only mode or that use a simple recovery model; these use the express full backup. For all other SQL server databases, synchronization transfers a transaction log backup, and recovery points are created for each incremental synchronization and express full backup.

Businesses of all sizes can rely on these best practices to configure Microsoft System Center Data Protection Manager 2010 to deliver a unified and scalable data protection solution that is cost effective, secure, and reliable for Windows servers, such as SQL servers.
References
1. http://www.microsoft.com/systemcenter/en/us/data-protection-manager.aspx
2. http://technet.microsoft.com/en-us/library/ff399519.aspx
3. http://www.equallogic.com/resourcecenter/assetview.aspx?id=5249