Template Version October 2011

Jetstress 2010
Jetstress Field Guide

Tuesday, 27 March 2012
Version 1.0.0.16 [Issued]

Prepared by neil.johnson@microsoft.com

Exchange Community

MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT.

Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, our provision of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

The descriptions of other companies’ products in this document, if any, are provided only as a convenience to you. Any such references should not be considered an endorsement or support by Microsoft. Microsoft cannot guarantee their accuracy, and the products may change over time. Also, the descriptions are intended as brief highlights to aid understanding, rather than as thorough coverage. For authoritative descriptions of these products, please consult their respective manufacturers.

© 2011 Microsoft Corporation. All rights reserved. Any use or distribution of these materials without express authorization of Microsoft Corp. is strictly prohibited.

Microsoft and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

The names of actual companies and products mentioned herein may be the trademarks of their respective owners.

Jetstress 2010, Jetstress Field Guide, Version 1.0.0.16 Draft. Prepared by neil.johnson@microsoft.com. "document.docx" last modified on 27 Mar. 12, Rev 7.

Revision and Signoff Sheet

Change Record

Date | Author | Version | Change reference
19/08/2010 | Neil Johnson | 0.1.0.7 | First draft sent out for review
26/08/2010 | Neil Johnson | 0.1.0.8 | Updates from MCS UK Messaging team
27/08/2010 | Neil Johnson | 0.1.0.9 | Updates from Alex Costa; added in lab test data
29/08/2010 | Neil Johnson | 0.1.0.10 | UK Messaging team feedback
12/09/2010 | Neil Johnson | 0.1.0.12 | Ross Smith IV review feedback; Robert Gillies review feedback
13/09/2010 | Neil Johnson | 1.0.0.0 | Released
16/09/2010 | Neil Johnson | 1.0.0.1 | Jeff Mealiffe review feedback; updated Jetstress test types 6.1
20/09/2010 | Neil Johnson | 1.0.0.2 | Scott Schnoll review feedback
29/09/2010 | Ross Smith IV | 1.0.0.3 | Incorporated troubleshooting and cmdline appendix sections
29/09/2010 | Ramon b. Infante | 1.0.0.4 | Review and Update on Log Replication
11/10/2010 | Neil Johnson | 1.0.0.5 | Incorporated Ramon b. Infante’s comments; updated Table 8 - Quick results analysis table
15/11/2010 | Neil Johnson | 1.0.0.6 | Updated for version 14.01.0225.017
12/04/2011 | Neil Johnson | 1.0.0.7 | Included guidance for Exchange 2003; general corrections and layout; added more troubleshooting information; instructions for testing a production server
13/04/2011 | Neil Johnson | 1.0.0.8 | Updates from community feedback; updated CRC Checksum section
13/04/2011 | Neil Johnson | 1.0.0.9 | Included Raid testing guidance; updated test process for shared storage
26/04/2011 | Neil Johnson | 1.0.0.10 | Added Appendix F on Background Database Maintenance
27/04/2011 | Neil Johnson | 1.0.0.11 | Consolidated feedback for release
18/05/2011 | Neil Johnson | 1.0.0.12 | Clarified section 8 to make it clearer what values need to be compared to the mailbox role calculator
09/06/2011 | Neil Johnson | 1.0.0.13 | Updated section 8 with better report data
2/11/2011 | Neil Johnson | 1.0.0.14 | Included fault finding feedback for mount point errors from Malvin M. Seale [15.2.5]; updated Appendix A setting thread count section to improve clarity
4/03/2012 | Neil Johnson | 1.0.0.15 | Added section [14] on BDM plus a link to Ross Smith IV’s post on Database Maintenance
27/03/2012 | Neil Johnson | 1.0.0.16 | Updated document template; added section on Jetstress testing virtual machines

Document Contributors

Name | Position | Section
Neil Johnson | Senior Consultant, UK MCS | Author
Alexandre Costa | SENIOR SDET, Exchange Test | Jetstress internals
Ross Smith IV | PRINCIPAL PROGRAM MANAGER, Exchange CXP | Configuring Jetstress
Ramon b. Infante | DIR, WW COMMUNITIES, UC | Various
Matt Gossage | PRINCIPAL PROGRAM MANAGER LEAD | Various

Reviewers

Name | Version | Position | Date
Doug Gowans | 0.1.0.7 | Senior Consultant, UK MCS | 13/08/2010
Michael Currie | 0.1.0.7 | Senior Consultant, UK MCS | 18/08/2010
Alexandre Costa | 0.1.0.7 | SENIOR SDET, Exchange Test | 19/08/2010
Neil Hobson | 0.1.0.9 | Infrastructure Consultant, UK MCS | 27/08/2010
Ross Smith IV | 0.1.0.10 | PRINCIPAL PROGRAM MANAGER, Exchange CXP | 10/09/2010
Robert Gillies | 0.1.0.10 | SOLUTION ARCHITECT, US-US-MCS Federal SL 1 | 10/09/2010
Jeff Mealiffe | 1.0.0.0 | SENIOR PROGRAM MANAGER, Exchange CXP | 16/09/2010
Scott Schnoll | 1.0.0.1 | PRINCIPAL TECHNICAL WRITER, Office Content Publishing (UA) | 20/09/2010
Ramon b. Infante | 1.0.0.4 | DIR, WW COMMUNITIES, UC | 29/09/2010
Internal community review | 1.0.0.7 | Various | 13/04/2011
Internal community review | 1.0.0.11 | Various | 27/04/2011

Table of Contents

1 Purpose
2 Introduction to Jetstress
3 Jetstress Internals
3.1 Main Jetstress Components
3.1.1 Auto Tuning Component
3.1.2 Thread Dispatcher
3.1.3 Background Log Checksummer
3.1.4 Offline Log and Database Checksummer
3.1.5 Reporting and Verification
4 Planning for Jetstress
4.1 Jetstress testing flow chart
4.2 When should I run Jetstress in my project?
4.3 Where should I run Jetstress in my infrastructure?
4.4 Testing Raid Arrays
4.4.1 Example of a failed degraded mode test
4.5 Jetstress testing inside virtual machines
4.5.1 What’s different about Jetstress inside a virtual machine?
4.6 How much time should I allocate for Jetstress testing?
4.6.1 Initialisation
4.6.2 Testing
4.6.3 Clean-up
4.7 Preparing for the Jetstress test
4.8 What happens if the test fails?
5 Installing Jetstress
5.1 Online Documentation
5.2 Jetstress Version and Download
5.3 Prerequisites
5.4 Getting ESE Files necessary for Jetstress
5.4.1 File locations from an installed Exchange Server
5.4.2 File locations from the installation media
5.5 Installation
5.5.1 Application Installation
5.5.2 ESE File Installation
6 Configuring Jetstress
6.1 Jetstress Test Types
6.1.1 Test a disk subsystem throughput
6.1.2 Test an Exchange mailbox profile
6.2 Initial configuration
7 Jetstress Output Files
8 Reading Jetstress report data
8.1 Target design values
8.2 Reading the Jetstress Test Result Report
8.2.1 Test Summary
8.2.2 Database Sizing and Throughput
8.2.3 Jetstress System Parameters
8.2.4 Database Configuration
8.2.5 Transactional I/O Performance
8.2.6 Background Database Maintenance I/O Performance
8.2.7 Log Replication I/O Performance
8.2.8 Total I/O Performance
8.2.9 Host System Performance
8.2.10 Test Log
8.3 Interpreting Jetstress test result
8.4 Test evaluation
9 Appendix A – Configuring thread count
10 Appendix B – Configuring sluggishsessions
10.1 Lab test data for SluggishSessions
11 Appendix C - Running a Jetstress Test with JetstressCmd.exe
12 Appendix D – Exchange 2003
13 Appendix E – Running Jetstress on a production server
14 Appendix F – Exchange 2010 BDM
14.1 What is Background Database Maintenance anyway and why do I need it?
14.2 So why does it cause problems for some types of storage?
14.2.1 iSCSI attached storage
14.2.2 Incorrectly configured storage (Raid Group Stripe Size too small)
14.3 What should I do if BDM is causing my Jetstress test to fail?
15 Common Issues
15.1 Log or Data Volumes cannot be overlapped
15.2 Troubleshooting Jetstress
15.2.1 Jetstress cannot attach to or create a database
15.2.2 Error loading Performance Monitor counters
15.2.3 Database Performance counters not working after using Jetstress
15.2.4 Unable to tune for the parameters
15.2.5 Unable to mount databases due to invalid mount point configuration

1 Purpose

This document is intended to explain the process and requirements for validating an Exchange storage solution prior to releasing an Exchange deployment into production.

It will explain how Jetstress works, how to plan for and perform a Jetstress test, and how to analyze the results of the test.

This document is not intended to provide Exchange storage design guidance. For guidance on Exchange server design and planning refer to Mailbox Server Storage Design.

2 Introduction to Jetstress

Jetstress is a tool for simulating Exchange database I/O load without requiring Exchange to be installed. It is primarily used to validate physical deployments against the theoretical design targets that were derived during the design phase.

To accurately simulate the complex Exchange database I/O pattern, Jetstress makes use of the same ESE.DLL that Exchange uses in production. It is therefore vital that Jetstress use the same version of the Extensible Storage Engine (ESE) files that your Exchange infrastructure will be built with in production.

Ideally, Jetstress testing will be part of the overall project plan. The best time to schedule Jetstress testing is just before Exchange will be physically installed onto the servers.

Jetstress testing provides the following benefits prior to deploying live users:

Validates that the physical deployment is capable of meeting specific performance requirements

Validates that the storage design is capable of meeting specific performance requirements

Finds weak storage components prior to deploying in production

Proves storage and I/O stability

The most important aspect of Jetstress testing is that it allows you to see how the physically deployed storage and server infrastructure will behave once a real Exchange workload is applied. This often works out differently from expectations, especially in scenarios where shared storage infrastructure is deployed or where the storage design is complex.

Often the Jetstress test will not produce the results that were expected. Sometimes subtle configuration changes to the storage infrastructure (for example, driver or firmware updates) are enough to get the test to pass.

Fundamentally, a successful Jetstress test validates that all of the hardware and software components within the I/O stack from the operating system down to the physical disk drive are working to a sufficient level to meet the predicted performance required by Exchange to operate successfully.

Note: The validity of your Jetstress testing is only as good as the user profile analysis and workload prediction that was completed during the design phase of the project.

3 Jetstress Internals

3.1 Main Jetstress Components

Like Exchange, Jetstress is an ESE-based application. It runs in user memory space and makes API calls to ESE, which in turn calls the Windows file system and I/O Manager to gain access to the data stored on disk. During each of these tasks, Windows records performance information about the specific task and the operating system as a whole. Once the test is completed, Jetstress analyses the performance data to determine whether the system meets the targets specified at the beginning of the test.

Figure 1 - Main Jetstress Components

3.1.1 Auto Tuning Component

This component is responsible for auto tuning within Jetstress. Each thread performs a set number of ESE calls, which generates a set amount of disk I/O, so raising or lowering the thread count per database changes the storage workload. The auto tuning component attempts to programmatically determine the maximum thread count that the storage solution can support while remaining within the published disk latency guidelines for Exchange Server. The Jetstress test parameters for disk latency are shown in section 8.3.

Note: Auto tuning is generally unsuccessful if more than a single database is stored on the same set of physical disk spindles, or if the disk spindles are not dedicated to the Exchange Server. It is generally better to set the thread count manually.

3.1.2 Thread Dispatcher

The thread dispatcher is responsible for managing workload within Jetstress. The main areas of interest within the thread dispatcher are as follows:

ThreadCount: the number of transactional threads per database (prior to Exchange 2010, this was the number of threads per storage group).

ThreadTypes: each thread chooses one type of work to perform against the database, and the same thread can perform different types of work during a given run. There are four types: insert, read, update and delete (all against records in a table). The default operation mix for an Exchange 2010 simulation is 40%, 35%, 5% and 20%, respectively.

SluggishSessions: the default is 1 for Exchange 2010. This setting is usually used to fine tune the amount of work performed by a given thread. Internally, a thread sleeps for (SluggishSessions * TaskRunTime) before picking up the next task to run. For example, if SluggishSessions is 3 and an insert thread took 100 msec in the last cycle, it will sleep for 300 msec before moving on to the next cycle. A value of 0 means “go full throttle”.
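The thread behaviour described above can be illustrated with a short Python sketch. It is not Jetstress code: the function names and the use of a weighted random choice are assumptions made for illustration, and only the operation mix and the (SluggishSessions * TaskRunTime) sleep rule come from the text.

```python
import random

# Default Exchange 2010 operation mix quoted in the text (percent).
OPERATION_MIX = {"insert": 40, "read": 35, "update": 5, "delete": 20}


def pick_operation(rng: random.Random) -> str:
    """Choose the next type of work according to the operation mix.

    Weighted random selection is an assumption; the text only gives the mix.
    """
    operations = list(OPERATION_MIX)
    weights = list(OPERATION_MIX.values())
    return rng.choices(operations, weights=weights, k=1)[0]


def sleep_time_msec(sluggish_sessions: int, task_run_time_msec: float) -> float:
    """Pause before the next task: SluggishSessions * TaskRunTime."""
    if sluggish_sessions < 0:
        raise ValueError("sluggish_sessions must be zero or greater")
    return sluggish_sessions * task_run_time_msec


if __name__ == "__main__":
    rng = random.Random(0)
    print("next operation:", pick_operation(rng))
    # Example from the text: SluggishSessions = 3, last insert took 100 msec.
    print("sleep (msec):", sleep_time_msec(3, 100))
```

With SluggishSessions set to 0 the sleep time is always zero, which matches the "full throttle" behaviour described above.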

3.1.3 Background Log Checksummer

This component simulates the I/O overhead of additional database copies. This copy operation has an I/O cost which increases with each additional copy.

3.1.4 Offline Log and Database Checksummer

This process checksums all database and log files at the end of a Jetstress run to ensure that all data is intact. It also provides performance data for CRC checksum speed should VSS copies require a checksum prior to backup.

This process is extremely hard on storage hardware, often applying an I/O load many times greater than the workload that the actual Jetstress test applies.

Important

If you are running Jetstress on multiple servers in parallel on shared storage infrastructure it is vital that the CRC check is not running while other servers are performing their Jetstress tests. Selecting the “multi-host” option during the test configuration causes the testing process to stop and wait for confirmation before beginning the CRC check to avoid servers interfering with each other’s results.

While you are working out the correct thread count for the test configuration, it is not necessary to let the checksum part of the test complete. To stop the checksum you can either click Cancel, which stops the checksum part of the test but still generates the performance test report, or edit the Jetstress configuration file and change the VerifyChecksum value to false (the default is true).

<VerifyChecksum>false</VerifyChecksum>
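If the configuration file is being prepared by a script rather than by hand, the same change could be made along the following lines. This is a minimal sketch: only the VerifyChecksum element name comes from this guide, while the surrounding XML structure and the sample fragment are hypothetical. Back up the real file first, because re-serialising XML may alter formatting and namespace prefixes.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip an XML namespace such as '{uri}Name' down to 'Name'."""
    return tag.rsplit("}", 1)[-1]


def set_verify_checksum(xml_text: str, enabled: bool) -> str:
    """Return xml_text with every VerifyChecksum element set to true or false."""
    root = ET.fromstring(xml_text)
    found = False
    for element in root.iter():
        if local_name(element.tag) == "VerifyChecksum":
            element.text = "true" if enabled else "false"
            found = True
    if not found:
        raise ValueError("no VerifyChecksum element found")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical fragment; a real configuration file has many more settings.
    sample = "<Configuration><VerifyChecksum>true</VerifyChecksum></Configuration>"
    print(set_verify_checksum(sample, enabled=False))
```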

3.1.5 Reporting and Verification

At the end of a Jetstress test, this process compares the observed performance results against a set of acceptable values. These results are then written to an HTML file. During the test, binary performance data is written out to a BLG file.

4 Planning for Jetstress

Jetstress testing is often misunderstood from a planning perspective, particularly how much time to allocate for testing and in which part of the project Jetstress testing should occur.

4.1 Jetstress testing flow chart

The aim of the following process is to find the maximum workload that still passes the test. Fundamentally, we need to increase the workload until the test fails. The last value before failure is the highest workload that the system can support. If this value is below the design target, we then use SluggishSessions to fine tune the test.

The following process assumes that you are using the disk subsystem throughput test.

Figure 2 - Jetstress test flowchart
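The core loop of this process (raise the workload until a test fails, then keep the last passing value) can be sketched in Python as follows. This is an illustration under stated assumptions rather than part of Jetstress: `run_test` is a hypothetical callable standing in for one complete Jetstress run, and the starting value, step of one thread and upper limit are arbitrary.

```python
from typing import Callable, Optional


def highest_passing_thread_count(
    run_test: Callable[[int], bool],
    start: int = 1,
    max_threads: int = 64,
) -> Optional[int]:
    """Raise the thread count until a test fails and return the last pass.

    run_test(thread_count) is a hypothetical callable that runs one test
    and returns True if it passed. Returns None if the first test fails.
    """
    last_pass = None
    for thread_count in range(start, max_threads + 1):
        if not run_test(thread_count):
            break
        last_pass = thread_count
    return last_pass


if __name__ == "__main__":
    # Stand-in for real test runs: pretend the storage copes with up to 7 threads.
    def simulated(threads: int) -> bool:
        return threads <= 7

    print(highest_passing_thread_count(simulated))
```

In practice each call to `run_test` represents a full test run, so the number of iterations should be planned for in the testing schedule.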

Note

This process is not recommended for shared storage deployments or during failure mode testing. In a shared storage deployment it is recommended to increase the thread count until the achieved transactional I/O per second is equal to or just above the design target.
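For the shared storage case in the note above, the stopping condition is the design target rather than a failed test. A minimal Python sketch, assuming a hypothetical `measure_iops` callable that runs one test at a given thread count and returns the achieved transactional I/O per second:

```python
from typing import Callable, Optional, Tuple


def first_thread_count_meeting_target(
    measure_iops: Callable[[int], float],
    target_iops: float,
    max_threads: int = 64,
) -> Optional[Tuple[int, float]]:
    """Return (thread_count, achieved_iops) for the first run at or above target.

    measure_iops(thread_count) is a hypothetical callable standing in for one
    test run. Returns None if the target is not reached within max_threads.
    """
    for thread_count in range(1, max_threads + 1):
        achieved = measure_iops(thread_count)
        if achieved >= target_iops:
            return thread_count, achieved
    return None


if __name__ == "__main__":
    # Illustrative numbers only: assume each thread adds 45 IOPS.
    print(first_thread_count_meeting_target(lambda t: t * 45.0, target_iops=300))
```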

4.2 When should I run Jetstress in my project?

Jetstress testing can often take place at multiple phases within the project plan. Depending on the design approach taken, Jetstress testing may be performed during both the planning (design) and build phases of a project.

Figure 3 - SDM phase overview

So, why would you run Jetstress during the planning/design phase of a project? The simple answer is that with today’s powerful hardware, the Exchange design team must often use standard “chunks” of hardware to create their design. Rather than guessing the I/O limits of the hardware, it is usually preferable to run some Jetstress tests on it to determine the maximum storage I/O capacity of the system. This allows the design team to specify the bill of materials much more precisely, thereby saving money and reducing risk.

But if you have already proven the solution in the lab, why test again at build time? This is a common question; many projects only schedule sufficient time for testing a single server and its storage solution with the belief that they only need to validate the design. The problem with this approach is that it assumes a zero error rate in the build out. What happens if someone forgets a part of the build on one server? Or deploys a different device driver from the one used in the lab? What happens if a faulty piece of hardware has been deployed? Jetstress testing at build time is a great way to validate that the physically deployed hardware and software are capable of providing the required I/O performance for Exchange. Jetstress testing at build time is also a way to identify failing components such as disk drives; it is much less stressful to identify a weak batch of disks during a Jetstress test than on a Monday morning after a large user migration!

If the project plan will allow it, build in sufficient time to test each and every server and storage chassis that will be deployed before migrating user mailboxes to it. Remember that Jetstress can be fully automated, so with a little bit of planning it can be left to run overnight and may not actually add any significant overhead to the project.

4.3 Where should I run Jetstress in my infrastructure?

To ensure that the Jetstress test is representative of production, it is recommended to run Jetstress on every set of disks that will hold a mailbox database copy (active, passive or lagged). The test should be run on all servers and disks simultaneously.

Note: It is important to remember not to run Jetstress on production servers that have Exchange Server already installed. This may lead to problems with Exchange performance counters. It is recommended to run Jetstress BEFORE installing Exchange Server into production.

If you have already installed and configured Jetstress on your production Exchange 2010 servers, refer to the following article for more information on resolving Exchange performance counter problems:

http://blogs.technet.com/b/mikelag/archive/2010/09/10/how-to-unload-reload-performance-counters-on-exchange-2010.aspx

Each database copy requires roughly 0.6x the active copy I/O to remain up-to-date; however the storage hosting the passive copy must be designed to provide sufficient I/O to support the copy if it were to become active. Therefore by testing each database LUN in parallel, we are validating that the storage solution is able to meet the design requirements. We are also validating that any pieces of shared infrastructure are able to meet the demand of the entire solution, rather than simply testing each server individually1.

1 Where there is no shared infrastructure and all storage is directly attached, servers may be tested individually; however, to be a valid test it must be configured to include any active, replica or lagged LUNs that could be online at the same time.


4.4 Testing RAID Arrays

Since the improvements in Exchange I/O from Exchange 2007 onwards, it is now viable to deploy Exchange Server databases on a wide range of storage configurations, from JBOD to RAID 6. RAID arrays offer a good compromise between data redundancy and performance; however, they can also suffer a significant performance reduction when operating in degraded mode (spindle failure). For this reason, it is recommended to design RAID arrays that will host Exchange Server databases so that they provide sufficient IOPS for the Exchange workload even when running in degraded mode.

Important:

While testing for failure scenarios it is not necessary to run your Jetstress test at peak working load; instead it is recommended to modify the thread count until the Jetstress test achieves just above the Total Database Required IOPS / Server value reported in the Mailbox Role Calculator.

From a service perspective, it is important to validate that your storage can provide sufficient performance in all common failure conditions. Due to this it is recommended to run the Jetstress test while the array is in the following conditions.

Array Condition | Test importance | Description
Optimal | Recommended for all deployments | All disk spindles operating normally
Degraded | Recommended for all deployments | Single spindle removed from the array
Rebuilding | Recommended if array has hot spare (see footnote 2) | Failed spindle replaced and array controller is rebuilding the array

Ideally the Jetstress test should still pass during a degraded mode test. If the test fails, refer to this post to analyse the failure severity.

2 If your array does not have a hot spare, you can choose to perform array rebuilds out of hours so that the end user impact is minimized; however, your data loss exposure is increased. If you plan on performing array rebuilds during working hours, it is recommended to perform a Jetstress test run while the array is rebuilding, even if you do not have a hot spare configured.


4.4.1 Example of a failed degraded mode test

This example shows an unacceptable test result. I have chosen to show an unacceptable result since a good test is basically just a flat line and isn’t particularly interesting. In this instance the storage was based on RAID 6. The Jetstress test was configured to run at 1256 IOPS (the Mailbox Role Calculator predicted 1200 IOPS). Approximately halfway through the test a hard disk drive was (carefully) removed from the array and the spare began rebuilding.

The test data shows that the average read I/O latency increased from 11ms to 400ms+, with latency spikes of 3000-4000ms on the affected LUN. This situation took 18 hours to return to normal after the failure.

Important: Common failure modes such as a disk rebuild should not materially affect the test results.

Figure 4: Degraded mode failure

Note:

Please refer to the following section on understanding storage configuration for Exchange Server 2010 for more information on recommended RAID configurations for Exchange Server.

http://technet.microsoft.com/en-us/library/ee832792.aspx


4.5 Jetstress testing inside virtual machines

A quick history lesson: Over the years we have seen a huge increase in deployments on hypervisor technology. During the early stages of hypervisor use for Exchange we worked with a number of customers who observed inaccurate results during their Jetstress tests of virtual machines. This culminated in the Exchange product group releasing a statement that advised against using Jetstress inside a virtual machine and instead to test on the root of the hypervisor – obviously this worked for Hyper-V, but was not quite so practical for all hypervisors. On 30th March 2012, after significant internal testing against modern hypervisors, the Exchange product group announced that it is now viable to perform your Jetstress testing directly from inside the virtual machines that are planned to host the Exchange Mailbox role.

The single caveat is that the hypervisor being used is one of the following or newer:

Microsoft Windows Server 2008 R2 (or newer)
Microsoft Hyper-V Server 2008 R2 (or newer)
VMware ESX 4.1 (or newer)

4.5.1 What’s different about Jetstress inside a virtual machine?

The approach and testing process do not change. The aim of the test is to validate that the storage presented to the virtual guest can provide sufficient performance to meet the predicted requirements from the mailbox role calculator. All performance counters and recommended values remain the same from a physical to a virtual guest and the recommendations for testing against raid arrays and in failure modes still apply.

However, there are some things that we may need to consider during our Jetstress testing.

1. Is the virtual host operating at a normal working load during our test? If the host has capacity for 10 virtual machines and we are testing with a single virtual machine running then there is the possibility that we will experience performance problems once the host is fully loaded.

2. Does the host server have any high availability technology that we need to test in degraded mode? This could include things like multiple paths to the storage or network, or maybe even a Hypervisor HA solution.

3. Follow the current recommended practices from both Microsoft and your hypervisor vendor. Yes I know this is obvious but it still amazes me how many problems are resolved by following the recommended guidance!

Guidance

The spirit of the test is to ensure that the system can meet its predicted workload during normal working conditions and also during any common failure modes for which the system has been designed to survive.


For more information about virtualizing Exchange Server:

Announcing Enhanced Hardware Virtualization Support for Exchange 2010
http://blogs.technet.com/b/exchange/archive/2011/05/16/announcing-enhanced-hardware-virtualization-support-for-exchange-2010.aspx

Demystifying Exchange 2010 SP1 Virtualization
http://blogs.technet.com/b/exchange/archive/2011/10/11/demystifying-exchange-2010-sp1-virtualization.aspx

Best Practices for Virtualizing Exchange Server 2010 with Windows Server® 2008 R2 Hyper V™
http://www.microsoft.com/download/en/details.aspx?id=2428

4.6 How much time should I allocate for Jetstress testing?

Jetstress testing can take a long time to complete and it is vital that this time is correctly planned for within your Exchange project plan.

Generally the test procedure can be broken up into three parts.

Initialisation
Testing
Clean-up

4.6.1 Initialisation

This phase includes installation, prerequisites and initial database creation. Of these tasks, the initial database creation takes the longest. Database creation time varies between hardware deployments; however, expect around 24 hours for 10TB of data.
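As a rough planning aid, the rule of thumb above (around 24 hours per 10TB) can be turned into a small estimate. This is only a sketch: the function name is hypothetical, and it assumes the creation time scales linearly with data size, which the guide does not state and which will vary with hardware.

```python
def estimate_initialisation_hours(data_tb: float, hours_per_10tb: float = 24.0) -> float:
    """Estimate database creation time from the guide's rule of thumb.

    Assumption: time scales linearly with the amount of data to initialise.
    """
    if data_tb < 0:
        raise ValueError("data_tb must be non-negative")
    return data_tb / 10.0 * hours_per_10tb


# Example: 25 TB of test databases at the default rate
print(estimate_initialisation_hours(25))
```

Treat the result as a starting point for the project plan and adjust it once the first database creation on the real hardware has been timed.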

4.6.2 Testing

The actual testing phase will vary depending on the complexity and maturity of the design. If your design is based on complex, cutting-edge storage technology, it is highly likely that you will need to allocate more time for testing. If your design is based on common direct attached components, the testing phase is likely to be quite short. For simple direct attached solutions allow 2 to 5 days; for complex SAN solutions try to allocate up to 10 working days. Troubleshooting storage performance issues can often be very time-consuming.

4.6.3 Clean-up

Before the server can be put into production it is necessary to remove the Jetstress application and the test databases that were created. The recommended procedure is as follows:

Uninstall Jetstress and reboot
Copy the Jetstress data to a safe location
Delete the Jetstress installation folder
Remove all test databases

Depending on complexity, allow between 1 and 2 hours per Exchange server that needs to have Jetstress uninstalled.


4.7 Preparing for the Jetstress test

Jetstress simulates an Exchange database workload. To ensure that the environment is ready, it should be configured according to both the hardware vendor’s and Microsoft’s recommendations.

Refer to Understanding Exchange 2010 Storage Configuration for further detail.

As a starting point ensure that the following conditions have been met.

1. All devices on the storage system must be listed on the Windows Hardware Compatibility List (HCL) if you are running Windows Server 2008 R2 and Windows Server 2008. For more information, see Products Designed for Microsoft Windows - Windows Catalog, Windows Compatibility Center, and Windows Logo'd Product List.

2. If multiple clusters will be sharing any aspect of the disk subsystem, the server/storage configuration must be Cluster/Multi-Cluster Certified.

3. Verify with vendors that drivers and firmware are current. Drivers and firmware include, but are not limited to, the following items:

a. Server BIOS/firmware
b. SCSI/Array Controller firmware and driver
c. Fibre Host Bus Adapter (HBA) firmware and driver
d. Fibre switch/hub firmware
e. SAN (Storage Area Network) enclosure Operating System/Microcode/firmware
f. Hard disk firmware

4. Verify that the HBA/SAN specific configuration is set correctly. Many HBAs use registry keys to customize the configuration to a specific SAN platform (for example, Queue Depth).

5. RAID controller stripe size is 256 KB or greater (refer to hardware vendor for guidance).

6. Read/Write Cache is 75% Write and 25% Read on all LUNs.

7. Configure the storage logical unit numbers (LUNs) (consider Exchange log devices and database devices).

8. Format the LUNs within Windows with the NTFS file system. Best practice = 64 KB allocation unit size.

9. NTFS Compression is not enabled.

10. File Level Anti-Virus is configured to exclude all Exchange data locations and any directories that Jetstress has been configured to use.

11. Storport.SYS has been updated to the latest supported version for your hardware.


4.8 What happens if the test fails?

It is important to determine the pass and fail criteria for the test. The test will find the peak working load that the storage is able to provide at the I/O latency targets recommended by the Microsoft Exchange Team. These are defined in section 8.3.

If the IOPS value recorded by the Jetstress test is above the target documented within the Exchange design, then the storage solution is deemed to have passed the test. If it does not meet the design target, then the storage solution is deemed to have failed the test.
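The IOPS comparison above can be sketched as a simple check. The function name and the returned fields are hypothetical, and treating an exact match as a pass is an assumption, since the guide only says "above". The check covers the IOPS comparison only; it does not evaluate the latency targets that Jetstress itself applies. The example numbers are those from section 4.4.1 (1256 IOPS achieved against 1200 IOPS predicted).

```python
def evaluate_iops(achieved_iops: float, required_iops: float) -> dict:
    """Compare achieved IOPS against the design target.

    Assumption: meeting the target exactly counts as a pass.
    """
    if required_iops <= 0:
        raise ValueError("required_iops must be positive")
    headroom_pct = (achieved_iops - required_iops) / required_iops * 100.0
    return {
        "passed": achieved_iops >= required_iops,
        "headroom_pct": headroom_pct,  # negative when the target is missed
    }


result = evaluate_iops(achieved_iops=1256, required_iops=1200)
print(result["passed"], round(result["headroom_pct"], 1))
```

A negative headroom figure gives a quick sense of how far the storage is from the design target before remediation starts.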

If the test shows that the storage has failed to meet its design targets it will be necessary to perform remediation. This usually involves a combination of resources from the design/project, build, hardware, and storage vendor teams. The aim of remediation is to determine why the achieved IOPS was below the design target and to provide a remediation plan before submitting the solution for a re-test.

Before beginning significant storage redesign work, it is important to check the basics listed in section 4.7 Preparing for the Jetstress test. The most common causes of Jetstress test failures are missing simple configuration steps during deployment and/or misconfiguring the Jetstress test itself.

Advice:

It is much easier to resolve configuration problems during this phase of the deployment than after the Exchange servers have been put into production. It is far better to suffer a small delay to the project timescales than put a service into production that does not meet its original goals.


5 Installing Jetstress

5.1 Online Documentation

The documentation for the Exchange Server 2010 version of Jetstress is available on TechNet at the following location.

http://technet.microsoft.com/en-us/library/ff706601.aspx

5.2 Jetstress Version and Download

Version | Build | Usage | Link
14.01.0225.017 | 32 bit | Exchange 2003 (see footnote 3) | http://www.microsoft.com/downloads/en/details.aspx?FamilyID=6c9c1180-4dd8-49c4-85fe-ca1cdcb2453c&displayLang=us
14.01.0225.017 | 64 bit | Exchange 2007, Exchange 2010 | http://www.microsoft.com/downloads/en/details.aspx?displaylang=en&FamilyID=13267027-8120-48ed-931b-29eb0aa52aa6

Table 1 - Jetstress version and download table

Note: Although there is a 32-bit build of Exchange 2007 it is not recommended or supported to use these ESE files to run a Jetstress test. This is due to the requirement for a 64-bit address space to accurately simulate a realistic Exchange I/O pattern.

Always ensure that you use the same version of Jetstress to initialise the databases and to perform the testing.

5.3 Prerequisites

Microsoft .NET Framework 3.5 or higher
A copy of your 64-bit production ESE files (see footnote 4)

o ese.dll
o eseperf.dll
o eseperf.hxx
o eseperf.ini
o eseperf.xml

It is extremely important that the version of ESE that is used for the test is the same version that will be used in production.

3 Refer to Appendix D – Exchange 2003 for information on configuring Jetstress 14.01.225.x for Exchange 2003

4 See section 5.4 for the locations of these files.

5.4 Getting ESE Files necessary for Jetstress

Jetstress requires ESE to function. The needed files are available from an installed Exchange server or from the Exchange installation media. It is recommended to get the files from an installed Exchange server that has been fully updated and patched. If you are deploying an Exchange server that is based on either RTM or SP1 code, however, it is possible to get the necessary files directly from the installation media without requiring an Exchange installation.

Note: AMD64 refers to the x86-64 bit architecture and is not specific to AMD processors. Do NOT use the x86 files!

5.4.1 File locations from an installed Exchange Server (see footnote 5)

File | Path
ESE.DLL | C:\Program Files\Microsoft\Exchange Server\V14\Bin
ESEPERF.DLL | C:\Program Files\Microsoft\Exchange Server\V14\Bin\perf\AMD64
ESEPERF.HXX | C:\Program Files\Microsoft\Exchange Server\V14\Bin\perf\AMD64
ESEPERF.INI | C:\Program Files\Microsoft\Exchange Server\V14\Bin\perf\AMD64
ESEPERF.XML | C:\Program Files\Microsoft\Exchange Server\V14\Bin\perf\AMD64

Table 2 - ESE file locations on running Exchange server
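Gathering the five files by hand is easy to get wrong, so the copy can be scripted. The sketch below is illustrative only: the function name and structure are hypothetical, the relative locations are taken from Table 2 (Exchange Server 2010 SP1), and it assumes the script is run with sufficient rights to read the Exchange Bin folder and write to the Jetstress folder.

```python
import shutil
from pathlib import Path
from typing import List

# File names and their location relative to the Exchange "Bin" folder,
# as listed in Table 2 (paths are for Exchange Server 2010 SP1).
ESE_FILES = {
    "ese.dll": Path("."),
    "eseperf.dll": Path("perf") / "AMD64",
    "eseperf.hxx": Path("perf") / "AMD64",
    "eseperf.ini": Path("perf") / "AMD64",
    "eseperf.xml": Path("perf") / "AMD64",
}


def copy_ese_files(exchange_bin: Path, jetstress_dir: Path) -> List[Path]:
    """Copy the ESE files into the Jetstress folder; fail if any are missing."""
    missing = [
        name for name, sub in ESE_FILES.items()
        if not (exchange_bin / sub / name).is_file()
    ]
    if missing:
        raise FileNotFoundError("Missing ESE files: " + ", ".join(missing))

    copied = []
    for name, sub in ESE_FILES.items():
        target = jetstress_dir / name
        shutil.copy2(exchange_bin / sub / name, target)  # keeps timestamps
        copied.append(target)
    return copied
```

With the default paths from this guide, the call would look like copy_ese_files(Path(r"C:\Program Files\Microsoft\Exchange Server\V14\Bin"), Path(r"C:\Program Files\Exchange Jetstress")). Checking for missing files before copying avoids leaving a partial set of ESE files in the Jetstress folder.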

5.4.2 File locations from the installation media (see footnote 6)

File | Path
ESE.DLL | \setup\serverroles\common
ESEPERF.DLL | \setup\serverroles\common\perf\amd64
ESEPERF.HXX | \setup\serverroles\common\perf\amd64
ESEPERF.INI | \setup\serverroles\common\perf\amd64
ESEPERF.XML | \setup\serverroles\common\perf\amd64

Table 3 - ESE file locations from installation media

5 These paths are for Exchange server 2010 SP1

6 These paths are for Exchange server 2010 SP1

5.5 Installation

Before starting this section, ensure that all prerequisites have been met and that Exchange Server is not installed on any servers being used for Jetstress testing.

5.5.1 Application Installation

# Instruction Screenshot

1. Begin Jetstress installation

2. Accept License agreement


3. Leave the installation options as default unless you have a good reason to change them.

Note: All performance data and HTML reports will be stored in the installation folder.

4. This is the last chance to stop the installation. Click on “Next” to install…


5. Once installation is completed click on “Close”.

Table 4 - Jetstress installation instructions

5.5.2 ESE File Installation

# Instruction Screenshot

1. Copy ESE prerequisite files into the Jetstress installation folder.

By default this is “c:\Program Files\Exchange Jetstress”


2. Click on the “Start” menu and open “Exchange Jetstress 2010”

Note: Jetstress requires local Administrator access. If User Account Control (UAC) is enabled, ensure that you start the JetstressWin.EXE process as an administrator.

3. Click on “Start new test”

4. Jetstress will attempt to use the ESE files that were copied over in step 1. The first time that this occurs Jetstress must be restarted. Verify in the output on this screen that the ESE version is correct and that the last line of the status output requires that Jetstress be restarted.

Close Jetstress

This is the end of the Jetstress installation.


Table 5 - ESE installation instructions


6 Configuring Jetstress

For the purposes of this document we will be configuring a disk subsystem throughput test. The goal of this test is to identify the peak working IOPS value that the storage subsystem can sustain while remaining within the disk latency targets established by the Exchange Product Group.

6.1 Jetstress Test Types

6.1.1 Test a disk subsystem throughput

This test uses some fixed parameters to determine the maximum storage performance at maximum working capacity (80%). This is the recommended test type since it identifies the maximum working load of the storage solution for use with Exchange Server 2010 while the disks are filled to capacity. The values observed from this test can be used both to qualify the solution ready for production and to calculate available system I/O headroom once the service is in production. This test should be regarded as mandatory for each and every Exchange server released into production.

6.1.2 Test an Exchange mailbox profile

This test helps you determine whether your storage system meets or exceeds the planned Exchange mailbox profile. In the Exchange mailbox profile test scenario, you can specify the number of mailbox users, IOPS per mailbox and quota size to simulate the profiled Exchange mailbox load. This test type can be useful if your storage has been specifically designed to operate only at a specific disk capacity (see footnote 7).

Note: Even if this test type is used, it is still recommended to complete the disk subsystem throughput test to determine the maximum working load of the storage solution at full capacity.

7 It is not recommended to design Exchange storage performance based on less than 80% utilisation capacity.

6.2 Initial configuration

# Instruction Screenshot

1. Click on the “Start” menu and open “Exchange Jetstress 2010”

2. Click on “Start new test”


3. Check that the status text does not ask for a restart and that the last two lines state that the ESE engine and performance libraries were detected.

4. Since this is the first time we are configuring a test we will accept the defaults and click next.

This will create a new configuration file called JetstressConfig.xml in the default installation directory.

5. Select the “Test disk subsystem throughput” test and click “next”


6. Check suppress tuning and use the thread count per database. To configure this value correctly refer to Appendix A – Configuring Thread Count.

7. Configure the test for “performance”. Do not configure “multi-host” unless the test is running against a shared storage solution. If the Exchange database design will have Background Database Maintenance (BDM) disabled (it is enabled by default), then uncheck the Run BDM checkbox.

For DAS deployments accept the defaults.

8. Enter the folder for storing the test results and set the correct duration for Jetstress. Performance tests should be run for a minimum of 2 hours.

Note: While you are working out the correct thread count to use, you can set a test duration shorter than 2 hours by typing a fractional number of hours directly into the window.

0.75 = 45m
0.50 = 30m
0.25 = 15m

9. Configure the test to represent the production deployment.

Number of databases should be the total on this server including all database copies.

Number of copies per database represents the number of total copies that will exist for each unique database. This value simply simulates some LOG I/O reads to account for the log shipping between active and passive databases – it does NOT actually copy logs between servers.

For example, if your 6 server DAG contained 30 databases, with 1 active copy, 2 passive HA copies and 1 lagged copy per database (or 120 database copies spread across 6 servers, with each server hosting 20 copies), you would set the Number of Databases to 20 and the Number of copies per database to 4.

10. Configure the database and log file paths appropriately.

Scroll to the bottom of this page to find the “next” link.

11. If this is the first time the test has been run select to “create new databases”, otherwise select “Attach existing databases”.


12. Verify that the paths are as expected and click “Prepare test”

13. This will begin database initialisation – this process will vary but plan on 24 hours for every 10TB worth of data to be initialised.

This value should equate to 80% of the available storage. Refer to section 8.2.2 for further information on database sizes.


14. Once the test has been initialised, click “Execute Test”.

15. Once the test has completed, close Jetstress and copy the Jetstress report and performance data somewhere for analysis.

Each performance test will generate the following files.

Performance_<date>.XML
Performance_<date>.HTML
Performance_<date>.BLG
DBChecksum_<date>.XML
DBChecksum_<date>.HTML
DBChecksum_<date>.BLG
XMLConfig_<date>.XML

Ensure that you make a copy of all of these files.

Note: In addition you may also wish to make a copy of the *.EVT files which contain event log data taken during the test.

Table 6 - Jetstress initial configuration
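The DAG example in step 9 involves a small calculation that is worth checking before typing values into Jetstress. The sketch below reproduces it; the function name is hypothetical, and it assumes that database copies are spread evenly across the DAG members, as in the example.

```python
from typing import Tuple


def jetstress_database_settings(
    servers: int, unique_databases: int, copies_per_database: int
) -> Tuple[int, int]:
    """Return (Number of databases, Number of copies per database) for one server.

    Assumption: copies are distributed evenly across all servers in the DAG.
    """
    total_copies = unique_databases * copies_per_database
    if total_copies % servers != 0:
        raise ValueError("copies do not divide evenly across servers")
    return total_copies // servers, copies_per_database


# 6-server DAG, 30 databases, 1 active + 2 passive + 1 lagged copy each
print(jetstress_database_settings(servers=6, unique_databases=30, copies_per_database=4))
# (20, 4)
```

If the copies are not evenly distributed in your design, size each server's test from its own copy count instead.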


7 Jetstress Output Files

This section explains which output files are created by the test and what they are intended for.

File | Content | Purpose
Performance_<date>.BLG | Binary performance data captured during the performance test. | To provide detailed data for analysis. Open this file in perfmon and examine the counters manually to understand reasons for failure.
Performance_<date>.XML | XML Report for the performance test | Provides the status report data in XML format.
Performance_<date>.HTML | HTML Report for the performance test | Provides an easy to read status report for the test.
DBChecksum_<date>.BLG | Binary performance data captured during the checksum test. | Provides binary performance data gathered during the CRC checksum of the database. Useful if the checksum fails or takes a long time to complete.
DBChecksum_<date>.XML | XML Report for the checksum test | Provides status report data in XML format.
DBChecksum_<date>.HTML | HTML Report for the checksum test | Provides an easy to read status report for the checksum test.
XMLConfig_<date>.XML | XML Configuration File | Provides a backup of the Jetstress Configuration file used for the test.

Table 7 - Jetstress output files
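Because every run produces the same set of files, archiving them can be scripted so that nothing is left behind when Jetstress is removed. The sketch below is an illustration only: the function name and folder arguments are hypothetical, and it matches files purely by the name prefixes shown in Table 7, plus the optional *.EVT event log files.

```python
import shutil
from pathlib import Path
from typing import List

# Name prefixes of the files listed in Table 7 (compared case-insensitively).
RESULT_PREFIXES = ("performance_", "dbchecksum_", "xmlconfig_")


def collect_results(results_dir: Path, archive_dir: Path,
                    include_event_logs: bool = True) -> List[str]:
    """Copy Jetstress output files to an archive folder and return their names."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(results_dir.iterdir()):
        name = item.name.lower()
        wanted = name.startswith(RESULT_PREFIXES) or (
            include_event_logs and name.endswith(".evt")
        )
        if item.is_file() and wanted:
            shutil.copy2(item, archive_dir / item.name)
            copied.append(item.name)
    return copied
```

Keeping one archive folder per test run makes it easier to compare reports between thread count changes.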


8 Reading Jetstress report data

This section will walk through a very simple sample report, and explain where the key values are stored and how to interpret the data.

8.1 Target design values

Before we can evaluate our Jetstress data we need to know what our design targets are. Assuming that the storage design was based on data from the Mailbox Role calculator, the information we need is in the following table on the Role Requirements tab.

Make a note of the following values.

Total Database Required IOPS / Server

8.2 Reading the Jetstress Test Result Report

The following report is for a test with 5 databases configured.

8.2.1 Test Summary

This section is a basic summary of the test, when it started, finished and which versions of operating system and ESE were used.

The most important part of this section is the overall test result!


8.2.2 Database Sizing and Throughput

This section shows some more detailed parameters regarding the test. A “test disk subsystem throughput” test report will always show 100% for Capacity Percentage and Throughput Percentage. In this example 5 x 10GB LUNs were used, and Jetstress created 8GB databases to test against, which is only 80% of the available space. This is normal behaviour; by default, in performance mode Jetstress will use 80% of the disk capacity to allow room for growth during the test process.
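The sizing in this example can be checked with a short calculation. The function name is hypothetical; the 80% figure is the default stated above, and one database per LUN is assumed, as in the example.

```python
def expected_database_size_gb(lun_capacity_gb: float, fill_fraction: float = 0.8) -> float:
    """Database size Jetstress is expected to create on one LUN.

    Assumption: one database per LUN, filled to the default 80% of capacity.
    """
    if not 0 < fill_fraction <= 1:
        raise ValueError("fill_fraction must be in (0, 1]")
    return lun_capacity_gb * fill_fraction


per_database = expected_database_size_gb(10)   # one 10GB LUN
total = 5 * per_database                       # five LUNs in the example
print(round(per_database, 1), round(total, 1))
```

If the database sizes in your report differ noticeably from this estimate, check the LUN sizes and the test type before trusting the result.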

The most important value in this section is the Achieved Transactional I/O per Second.

Note:

To validate that the test has met the design requirements, compare the Achieved Transactional I/O per Second from your Jetstress report to the Total Database Required IOPS / Server value from the Mailbox Role Calculator that you recorded in section 8.1.

8.2.3 Jetstress System Parameters

This section displays some system values that Jetstress used for this test. The important values for analysis here are the thread count and number of copies per database.


8.2.4 Database Configuration

This section lists the paths for each database and log combination. In this example only one database was configured. Check that all of the test databases are listed here and the path names are correct.

8.2.5 Transactional I/O Performance

This section of the report displays the Transactional I/O values that were achieved for each database. Transactional I/O does not include I/O for Background Database Maintenance.

BDM I/O is mostly sequential so it is not usually considered during the design phase.

Note:

If you sum the values highlighted in the red box, the total should equal the Achieved Transactional I/O per Second.

8.2.6 Background Database Maintenance I/O Performance


This section displays the I/O that was used to perform Background Database Maintenance only. The sum of values in the red box shows the total amount of IO used for BDM operations.

8.2.7 Log Replication I/O Performance

This section displays the I/O overhead for LOG file replication. In this example there were no replica copies (replicas=1), which is shown by a zero count for I/O Log Reads/sec. If this value is greater than zero, it confirms that database replication is being simulated.

Note:

Yes one day I will include an example report with LOG IO in it


8.2.8 Total I/O Performance

This table shows all I/O that was recorded during the test (transactional I/O plus BDM I/O plus LOG I/O). The summation of I/O values from areas highlighted in red in this table should agree (roughly) with those observed at the storage subsystem.

In this case, the summation suggests that the storage subsystem had to deal with a total of 518 IOPS. However, more than half of those IOPS were sequential and so were not considered as transactional I/O which is what we are interested in. Sequential I/O is very easy on the disk subsystem.

The following chart shows the observed IOPS from the Windows test host during the Jetstress test. This counter includes all system IOPS as well as the test IOPS; however there should be a strong correlation between the two metrics. In this example we observed 529 IOPS at the host.

Figure 5 - Host observed IOPS

It is vital to differentiate between sequential IOPS and transactional (Random) IOPS when validating your storage.


We are only interested in transactional IOPS when we are Jetstress testing. BDM and LOG I/O are sequential in nature, so we ignore them from a performance planning perspective for Exchange Server.

Often storage teams are confused by the results of a Jetstress test since the achieved transactional I/O per second value is much lower than the observations they make at the storage system. It is important to differentiate between the workloads.

Note:

It is an invalid approach to sum the values displayed in the Total I/O performance table and compare them to the Total Database Required IOPS / Server predicted by the Mailbox Role calculator. The only value from the Jetstress report that is required for validation is Achieved Transactional I/O per Second. All other values are for support and curiosity only!

8.2.9 Host System Performance

This section of the report shows the observed system performance during the test. This section is most often used for troubleshooting. The most important thing to note from this section is that the CPU load from Jetstress is usually minimal. Jetstress has been optimized to evaluate the storage subsystem and not the host performance itself.


8.2.10 Test Log

This section of the report is a log of the Jetstress test. It is most often used for troubleshooting or to record how long a test took to complete various stages.


8.3 Interpreting Jetstress test result

Jetstress evaluates latency values for Database Reads and LOG writes since these affect the end user experience.

Performance Test “Strict” mode (<= 6 hour test)

Average Database Read Latency: 20ms
Average Log File Write Latency: 10ms
Max Database Read Latency: 100ms
Max Log File Write Latency: 100ms

Stress Test “Lenient” mode (> 6 hour test)

Average Database Read Latency: 20ms
Average Log File Write Latency: 10ms
Max Database Read Latency: 200ms
Max Log File Write Latency: 200ms
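The thresholds above can be expressed as a small lookup, which may help when checking many reports by hand. This is a minimal Python sketch rather than anything Jetstress itself exposes: the names `LatencyLimits`, `limits_for` and `latency_failures` are hypothetical, and it assumes a value must be below the limit to pass (following the "<20ms" wording in section 8.4).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LatencyLimits:
    """Latency limits in milliseconds, as listed in section 8.3."""
    avg_db_read_ms: float
    avg_log_write_ms: float
    max_db_read_ms: float
    max_log_write_ms: float


STRICT = LatencyLimits(20, 10, 100, 100)   # performance test, <= 6 hours
LENIENT = LatencyLimits(20, 10, 200, 200)  # stress test, > 6 hours


def limits_for(duration_hours: float) -> LatencyLimits:
    """Pick the limit set from the test duration."""
    return STRICT if duration_hours <= 6 else LENIENT


def latency_failures(duration_hours: float, avg_db_read: float,
                     avg_log_write: float, max_db_read: float,
                     max_log_write: float) -> List[str]:
    """Return the names of the latency checks that are not below their limit."""
    limits = limits_for(duration_hours)
    checks = [
        ("average database read latency", avg_db_read, limits.avg_db_read_ms),
        ("average log write latency", avg_log_write, limits.avg_log_write_ms),
        ("max database read latency", max_db_read, limits.max_db_read_ms),
        ("max log write latency", max_log_write, limits.max_log_write_ms),
    ]
    # Assumption: a value equal to the limit counts as a failure.
    return [name for name, value, limit in checks if value >= limit]
```

For example, a 150ms maximum database read latency would fail a 2 hour performance test but pass a 24 hour stress test under these limits.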

Note: For further information about performance counters on the Exchange 2010 Mailbox Role, refer to the following page

http://technet.microsoft.com/en-us/library/ff367871.aspx

Note:

There is a bug in version 14.01.0225.017 of Jetstress that may cause a test to be run in the wrong mode. Reference the following blog for details.

http://blogs.technet.com/b/neiljohn/archive/2011/06/09/jetstress-2010-stress-test-problem.aspx

A fix is expected in the next release.


8.4 Test evaluation

Evaluate the following criteria for each test run. The first test is validated against the design target and must be performed manually; Jetstress does not validate this value. The second and third are against pre-defined latency targets for Exchange; if these values are not within tolerance, Jetstress will report the test as failed.

1. DB IOPS Target: Is the Achieved Transactional I/O per Second in the test report higher than the Total Database Required IOPS / Server predicted in the Mailbox Role Calculator?

2. Is the I/O Database Reads Average Latency in the test report <20ms?

3. Is the I/O Log Writes Average Latency in the test report <10ms?

| DB IOPS Target | DB Read Latency | LOG Write Latency | Action |
| --- | --- | --- | --- |
| PASS | PASS | PASS | Test successful. |
| FAIL | PASS | PASS | The test is failing to meet the IOPS target, but the latency values are good. Increase the thread count by 1 and re-test. Use sluggishsessions to fine tune if necessary. |
| PASS | FAIL | FAIL | At least one database has recorded latency over threshold. If the latency values are very close to limits, increase sluggishsessions by 1; if both target IOPS and latency values are much higher, decrease the thread count. |
| PASS | PASS | FAIL | As above. |
| PASS | FAIL | PASS | As above. |
| FAIL | FAIL | FAIL | If the test shows that Achieved IOPS is below the design target AND the test latency values are above limits, the storage solution is unable to meet the requirements. At this stage it is necessary to re-evaluate the storage design and begin troubleshooting the physical deployment to determine the correct remediation. |
| FAIL | FAIL | PASS | As above. |
| FAIL | PASS | FAIL | As above. |

Table 8 - Quick results analysis table
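The quick results analysis can also be read as a simple decision function. The following Python sketch is illustrative only: `iops_target_met` and `recommend_action` are hypothetical names, the action strings are abbreviated from the table, and the IOPS check uses "higher than" as worded in criterion 1.

```python
def iops_target_met(achieved_transactional_iops: float,
                    required_iops_per_server: float) -> bool:
    """Criterion 1: achieved transactional IOPS must be higher than the design target."""
    return achieved_transactional_iops > required_iops_per_server


def recommend_action(iops_ok: bool, db_read_ok: bool, log_write_ok: bool) -> str:
    """Map the three PASS/FAIL results onto the action column of the table."""
    latency_ok = db_read_ok and log_write_ok
    if iops_ok and latency_ok:
        return "Test successful"
    if not iops_ok and latency_ok:
        # IOPS target missed but latency is healthy: push harder.
        return "Increase the thread count by 1 and re-test"
    if iops_ok and not latency_ok:
        # IOPS target met but at least one latency value is over threshold.
        return "Increase sluggishsessions by 1 or decrease the thread count"
    # IOPS target missed AND latency over threshold.
    return "Re-evaluate the storage design and troubleshoot the deployment"
```

All eight PASS/FAIL combinations fall into one of these four branches, which matches the grouping of rows in the table.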


9 Appendix A – Configuring thread count

Thread count controls how many IOPS Jetstress attempts to drive through the storage subsystem. Setting this value correctly requires some trial and error. For the process described within this document, the goal is to increase the thread count to a value that fails and then reduce the value until the test passes. The result should then represent the peak working IOPS value that the storage subsystem can support.

Each thread will generate the same quantity of IOPS per database. So for example, if the storage design team recommended that the storage for a given server was able to support 1000 IOPS and that server hosted 6 active databases and 3 replica databases we would use the following formula for a starting point.

Target IOPS = 1000

Total Databases = Active (6) + Replica (3) + Lagged (0) = 9

$$\text{Starting thread count} = \frac{\text{Target IOPS}}{\text{Total Databases} \times 65}$$

Given this example…

$$\text{Starting thread count} = \frac{1000}{9 \times 65} \approx 1.7 \quad (\text{round up to } 2)$$

Notes:

If in doubt start with thread=1 and work up until the test fails.

If the thread count predicted is less than 1 it may be necessary to modify the sluggishsessions value afterwards.

The exact quantity of IOPS generated per thread will change as the storage system workload changes. As the storage system gets closer to its performance limit the IOPS per thread value will reduce. Jetstress was designed to produce approximately 65 IOPS/DB per thread at 20ms disk latency.
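The starting-point formula above is easy to script when several servers need to be sized. This is a minimal Python sketch under the document's own assumption of approximately 65 IOPS per database per thread; the function name `starting_thread_count` is hypothetical, and clamping the result to a minimum of 1 reflects the note that sluggishsessions is the tool for anything below one thread.

```python
import math

# Approximate IOPS generated per database per thread at 20ms disk latency,
# as stated in this appendix.
IOPS_PER_THREAD_PER_DB = 65


def starting_thread_count(target_iops: float, active: int,
                          replica: int = 0, lagged: int = 0) -> int:
    """Estimate a starting thread count; the real value still needs trial and error."""
    total_databases = active + replica + lagged
    if total_databases <= 0:
        raise ValueError("at least one database is required")
    raw = target_iops / (total_databases * IOPS_PER_THREAD_PER_DB)
    # Round up, and never go below a single thread.
    return max(1, math.ceil(raw))


print(starting_thread_count(1000, active=6, replica=3))
```

With the worked example (1000 IOPS across 9 databases) the raw value is about 1.7, so the sketch returns 2.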


10 Appendix B – Configuring sluggishsessions

If it is not possible to achieve the right IOPS value by modifying the thread count it becomes necessary to modify the sluggishsessions value within the JetstressConfig.xml file.

The sluggishsessions value adds a pause between each task. This allows a level of fine-tuning over the workload dispatched by Jetstress.

As sluggishsessions is increased the achieved IOPS value decreases.

To change the value, open the JetstressConfig.xml file and look for the default configuration option

<SluggishSessions>1</SluggishSessions>

Modify the value, save the configuration file and then re-start Jetstress.
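If the change needs to be repeated across several test hosts, the edit can be scripted. The following Python sketch is an illustration only: it assumes nothing about the layout of JetstressConfig.xml beyond the `<SluggishSessions>` element shown above, the function name `set_sluggish_sessions` is hypothetical, and the surrounding `<configuration>` root used in the example is made up. Work on a copy of the file, since re-serialising XML can alter formatting.

```python
import xml.etree.ElementTree as ET


def set_sluggish_sessions(xml_text: str, value: int) -> str:
    """Return the XML text with every SluggishSessions element set to `value`."""
    root = ET.fromstring(xml_text)
    # Match on the local element name so a namespace prefix does not matter.
    matches = [el for el in root.iter()
               if isinstance(el.tag, str)
               and el.tag.rsplit("}", 1)[-1] == "SluggishSessions"]
    if not matches:
        raise ValueError("no SluggishSessions element found")
    for element in matches:
        element.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical fragment; the real file contains many more settings.
sample = "<configuration><SluggishSessions>1</SluggishSessions></configuration>"
print(set_sluggish_sessions(sample, 2))
```

As with a manual edit, Jetstress needs to be restarted after the file is saved.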


10.1 Lab test data for SluggishSessions

The following test data is intended as a guide to help with configuring thread count and sluggishsessions. It shows the target IOPS values that Jetstress will attempt to reach for a given configuration. As latency increases, the achieved IOPS values will diminish. This data was taken from a single database running on a RAID 10 LUN with 24 x 2.5” 10k SAS HDDs.

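SluggishSessions=1

| Thread Count | DB IOPS | LOG IOPS | Total IOPS (excl BDM) | BDM IOPS | Avg DB Read Latency (ms) | Avg DB Write Latency (ms) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 120 | 42 | 162 | 29 | 7.5 | 2.2 |
| 2 | 214 | 70 | 284 | 30 | 7.7 | 1.1 |
| 3 | 490 | 105 | 595 | 30 | 5.9 | 1.1 |
| 4 | 679 | 136 | 815 | 30 | 6.1 | 1.5 |
| 5 | 811 | 153 | 964 | 30 | 6.2 | 1.1 |
| 6 | 947 | 165 | 1112 | 30 | 6.3 | 1.2 |
| 7 | 1117 | 189 | 1306 | 30 | 6.4 | 1.2 |
| 8 | 1220 | 195 | 1415 | 30 | 6.6 | 1 |
| 9 | 1305 | 204 | 1509 | 30 | 7 | 1 |

SluggishSessions=2

| Thread Count | DB IOPS | LOG IOPS | Total IOPS (excl BDM) | BDM IOPS | Avg DB Read Latency (ms) | Avg DB Write Latency (ms) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 89 | 31 | 120 | 28 | 7.2 | 4.4 |
| 2 | 146 | 48 | 194 | 30 | 7.3 | 1.4 |
| 3 | 218 | 67 | 285 | 30 | 7.8 | 1.3 |
| 4 | 289 | 84 | 373 | 30 | 7.9 | 1 |
| 5 | 534 | 110 | 644 | 30 | 6.3 | 1.3 |
| 6 | 631 | 125 | 756 | 30 | 6.4 | 1.2 |
| 7 | 740 | 146 | 886 | 30 | 6.5 | 1.1 |
| 8 | 835 | 156 | 991 | 30 | 6.5 | 1.2 |
| 9 | 944 | 160 | 1104 | 30 | 6.5 | 1 |

SluggishSessions=3

| Thread Count | DB IOPS | LOG IOPS | Total IOPS (excl BDM) | BDM IOPS | Avg DB Read Latency (ms) | Avg DB Write Latency (ms) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 64 | 20 | 84 | 28 | 7.2 | 5 |
| 2 | 106 | 34.4 | 140.4 | 28 | 7.9 | 3.2 |
| 3 | 154 | 48 | 202 | 29 | 8.1 | 1.7 |
| 4 | 235 | 69 | 304 | 30 | 7.6 | 1.2 |
| 5 | 289 | 84 | 373 | 30 | 7.8 | 1.2 |
| 6 | 486 | 99 | 585 | 30 | 6.3 | 1.4 |
| 7 | 544 | 109 | 653 | 30 | 6.3 | 1.2 |
| 8 | 610 | 118 | 728 | 30 | 6.5 | 1.4 |
| 9 | 723 | 136 | 859 | 30 | 6.3 | 1.1 |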

[Chart: Total IOPS (excluding BDM), 0 to 1600, plotted against thread count (1 to 9) for SluggishSessions=1, SluggishSessions=2 and SluggishSessions=3]

Figure 6 - Effects of SluggishSessions on IOPS workload
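One way to use the lab data is as a rough lookup for a first guess at thread count and sluggishsessions. The Python sketch below copies the DB IOPS column from the lab data into a dictionary and returns the combination whose DB IOPS is closest to a target. The function name `closest_lab_configuration` is hypothetical, and the figures only describe the single-database lab system above, so any result is a starting point rather than a prediction for other storage.

```python
from typing import Tuple

# DB IOPS by SluggishSessions value; list position 0 is thread count 1.
LAB_DB_IOPS = {
    1: [120, 214, 490, 679, 811, 947, 1117, 1220, 1305],
    2: [89, 146, 218, 289, 534, 631, 740, 835, 944],
    3: [64, 106, 154, 235, 289, 486, 544, 610, 723],
}


def closest_lab_configuration(target_db_iops: float) -> Tuple[int, int, int]:
    """Return (thread_count, sluggish_sessions, lab_db_iops) nearest the target."""
    candidates = []
    for sluggish_sessions, iops_by_thread in LAB_DB_IOPS.items():
        for thread_count, db_iops in enumerate(iops_by_thread, start=1):
            distance = abs(db_iops - target_db_iops)
            # Ties resolve to the lower thread count, then lower sluggishsessions.
            candidates.append((distance, thread_count, sluggish_sessions, db_iops))
    _, thread_count, sluggish_sessions, db_iops = min(candidates)
    return thread_count, sluggish_sessions, db_iops


print(closest_lab_configuration(500))
```

The data also shows why sluggishsessions is useful for fine-tuning: between thread counts 2 and 3 at SluggishSessions=1 the DB IOPS jump from 214 to 490, and the other two settings fill in that gap.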


11 Appendix C - Running a Jetstress Test with JetstressCmd.exe

Both JetstressWin.exe and JetstressCmd.exe use the common Jetstress core library files, which means you will have comparable test results with the same XML configuration file. We recommend that you use JetstressWin.exe to create new test scenarios, and JetstressCmd.exe to open and run the test scenarios by using the /config command-line option. You can also see all the other available options by using the /? (help) command-line option.

| Action | Argument and Example of Use | Description |
| --- | --- | --- |
| help | /? | The help for the command-line program |
| Config | /c JetstressConfig.xml | Open a configuration file |
| Generate | /g | Generate a sample XML configuration file |
| TimeOut | /TimeOut 2H0M0S | Test Duration. Default is 2 hours. |
| Output | /output c:\output | Path for test output. Default is the current directory. |
| DBPath | /dbpath m:\sg1\mdb /dbpath n:\sg2\mdb | Database paths for each storage group |
| LogPath | /log x:\sg1\log y:\sg2\log | Log path for each storage group |
| PctCapacity | /pctcapacity 100 | Specify capacity percentage |
| Throughput | /throughput 100 | Specify throughput percentage |
| Threads | /threads | Suppress auto tuning and specify thread count |
| DoNotRunDBPerformance | | Do not run background database maintenance during performance/stress test |
| RunDBPerformance | | Run background database maintenance during soft recovery test |
| New | /new | Create new databases |
| Open | /open | Open existing databases |
| Bak | /bak | Restore backup database |
| Recovery | /recovery | Run soft recovery test |
| Streaming | | Run streaming backup test |
| Transaction | | Run transaction performance test |
| VerifyCheckSum | | Run database checksums |
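When the same scenario is run repeatedly, it can help to assemble the command line in a script. The Python sketch below only builds the argument list and uses just the `/c`, `/TimeOut` and `/output` arguments shown in the table. The helper names `format_timeout` and `build_command` are hypothetical, and the assumption that each argument is followed by a space-separated value is taken from the examples in the table; check `/?` on your build before relying on it.

```python
from typing import List, Optional


def format_timeout(hours: int, minutes: int = 0, seconds: int = 0) -> str:
    """Render a duration in the 2H0M0S style shown in the table."""
    return f"{hours}H{minutes}M{seconds}S"


def build_command(config_file: str,
                  timeout: Optional[str] = None,
                  output_dir: Optional[str] = None) -> List[str]:
    """Build a JetstressCmd.exe argument list from a saved configuration file."""
    args = ["JetstressCmd.exe", "/c", config_file]
    if timeout is not None:
        args += ["/TimeOut", timeout]
    if output_dir is not None:
        args += ["/output", output_dir]
    return args


print(build_command("JetstressConfig.xml", format_timeout(2), r"c:\output"))
```

The resulting list could be passed to `subprocess.run` on the test host; that step is left out here because it depends on where Jetstress is installed.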


12 Appendix D – Exchange 2003

One of the major changes after Exchange Server 2003 was the removal of storage groups. When you begin to configure your Jetstress test with Exchange Server 2003 ESE files you will notice that there is no way to configure storage groups in the test.

Information

A storage group in Exchange Server 2003 was simply a group of databases that shared the same log stream.

To simulate an Exchange Server 2003 storage group it is necessary to configure multiple databases with the same log folder path.

In the following example two storage groups are being simulated with two databases in each one.

Storage Group 1 (Log Path G:\)

Database 1 Database 2

Storage Group 2 (Log Path H:\)

Database 3 Database 4


13 Appendix E – Running Jetstress on a production server

The formal support position on this is that you shouldn’t do it – ever – at all – under any circumstances – in fact you shouldn’t even be reading this section of the field guide. However, we all accept there are cases where it can be necessary, such as when attaching new storage to an existing server or troubleshooting performance bottlenecks on existing servers.

That still doesn’t mean it’s ok to do it!!

If you really MUST do it, here are some things to know before beginning…

- Record the start-up state of all Exchange Services.
- Stop and Disable all Exchange Services on the server.
- Copy the ESE files from the currently installed version of Exchange server. Jetstress will detect that the performance counters are already installed for this version of ESE and will use them; this will prevent performance counter problems afterwards!
- Do not unload/reload performance counters after the test (if you have used the same ESE files as are currently installed this is unnecessary and could break things!).
- Remember to clean up the Jetstress test databases after testing.
- Uninstall Jetstress.
- Set Exchange Services back to the state they were in before you began testing.
- Reboot your Exchange Server.
- Check that Exchange Performance counters are working.
- Inspect Windows System and Application Event logs for errors.

Remember: This is not supported or recommended – only follow this as a matter of last resort or under the instruction of Microsoft Support/Microsoft Consulting Services.


14 Appendix F – Exchange 2010 BDM

Exchange Server 2010 brought with it some changes to database maintenance. One of the least understood of these new features is Background Database Maintenance.

Note: Further information on Exchange Database Maintenance can be found on the Exchange Team blog.

14.1 What is Background Database Maintenance anyway and why do I need it?

Exchange maintains physical database page integrity by generating a CRC checksum and storing it on each page of the database as it is written. This checksum is then re-calculated and compared to the stored page CRC value every time the page is read from disk. This allows the Exchange store process to determine if the data has been changed or corrupted since it was written. This process is a vital part of how Exchange maintains database integrity and its ability to detect data corruption.

Back when we did streaming backups to tape, the Exchange store would read every single page of the database during each full backup. This meant that we performed a full CRC checksum of the database for each full backup. This both validated that our databases were reliable and ensured that the data that we had written to tape was physically valid.

When we moved to Exchange Server 2007 we deprecated ESE streaming backups. When we take a non-streaming backup, the data movement is performed via VSS and not the Exchange store; this means that although we benefit from much faster backups, we lose the individual page CRC validation during backup. This issue was solved in Exchange 2007 by enabling something called DBScan, which essentially allocated 50% of the database maintenance window to performing CRC page checksumming. This process could run on both active and passive database copies, meaning that we could be sure that all of the pages in our Exchange databases were stored as we intended them to be – regardless of our backup technology or schedule.

With Exchange Server 2010, the database structure was altered radically and we also no longer supported streaming backups. Exchange Server 2010 also brought some changes in database maintenance. One of these changes was the introduction of Background Database Maintenance. This process performs physical database page CRC checksumming while the database is online. CRC page checksumming is an extremely sequential operation; the process begins at the first database page and works through the database, in order, until it reaches the end. This process runs no more than once in a 24 hour period and ensures that the physical database pages are not corrupt on any of the replica database copies in our deployment (in the default configuration). Additionally, the BDM process is able to detect unreadable physical disk sectors and lost flushes.


Changes to the Exchange Server 2010 Database and Maintenance processes are discussed here.

http://technet.microsoft.com/en-us/library/bb125040.aspx

Additionally, there is a level 300 webcast by Matt Gossage (Exchange PG) that explains even more about how Exchange Server 2010 uses storage.

https://msevents.microsoft.com/CUI/WebCastEventDetails.aspx?culture=en-US&EventID=1032418921&CountryCode=US

14.2 So why does it cause problems for some types of storage?

The BDM database checksumming process is designed to take advantage of large sequential I/O’s. BDM has a hard coded throttle of a 20ms delay between each request, which on most deployments results in each Exchange database requiring roughly 5MB/sec in BDM throughput.

The result of this is that the BDM process issues 256KB I/O reads, sequentially to your chosen disk storage at a rate of 5MB/sec per database. If you have deployed locally attached disk spindles and configured your Raid Stripe Size to be 256KB+ then it is unlikely that BDM will have any noticeable effect on your storage performance.

However there are some circumstances where BDM can have a noticeable effect on storage performance…

14.2.1 iSCSI attached storage

iSCSI is a block level network attached storage solution. As such the storage is limited by the network connection speed between the server and the storage host. Given that we require 5MB/sec throughput for BDM per Exchange database, it is evident that there are certain limitations in scalability.

Take a recent example of an Exchange 2010 DAG deployment where each Exchange Server Mailbox node hosted 40 databases and each node was connected to a separate iSCSI storage array.

40 Databases x 5MB/sec = 200MB/Sec

This is obviously more than a single 1Gbit/sec connection could sustain (roughly 125MB/sec before protocol overhead), and so the service would need to be deployed on a 10Gbit/sec solution or multiple 1Gbit/sec links joined together, significantly increasing costs.

Important:

When deploying on iSCSI take BDM into account where possible
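The bandwidth arithmetic above generalises to any database count. The Python sketch below uses the roughly 5MB/sec per database figure from this appendix and treats 1Gbit/sec as 125MB/sec, ignoring protocol overhead and all transactional and log traffic, so it gives a lower bound for BDM alone. The function names are hypothetical.

```python
import math

BDM_MB_PER_SEC_PER_DATABASE = 5  # approximate figure from section 14.2


def bdm_throughput_mb_per_sec(databases: int) -> int:
    """Aggregate BDM read throughput for all databases on a server."""
    return databases * BDM_MB_PER_SEC_PER_DATABASE


def link_capacity_mb_per_sec(link_gbit_per_sec: float) -> float:
    """Theoretical link capacity: 1 Gbit/sec is 125 MB/sec before overhead."""
    return link_gbit_per_sec * 1000 / 8


def links_needed_for_bdm(databases: int, link_gbit_per_sec: float) -> int:
    """Minimum number of links needed to carry BDM traffic alone."""
    required = bdm_throughput_mb_per_sec(databases)
    return math.ceil(required / link_capacity_mb_per_sec(link_gbit_per_sec))


print(bdm_throughput_mb_per_sec(40))   # 200 MB/sec for the 40 database example
print(links_needed_for_bdm(40, 1))     # 1Gbit/sec links needed for BDM only
```

For the 40 database example this gives 200MB/sec, which needs at least two 1Gbit/sec links before any user workload is considered.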


14.2.2 Incorrectly configured storage (Raid Group Stripe Size too small)

Take another example from the field, this time an Exchange Server DAG deployed against an FC SAN. The storage had not been configured according to the Exchange Server 2010 recommendations. Instead of large 256KB I/O requests for BDM being sent to the individual disk spindles, they were being broken up into 64KB I/O requests by the storage controller. This meant that the disk spindles were required to provide more IOPS, and predictably this caused the storage to fail the Jetstress test while BDM was enabled.
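A quick calculation shows why the smaller request size matters. The Python sketch below takes the roughly 5MB/sec per database BDM figure from section 14.2 and divides it by the request size, assuming 1MB = 1024KB. The function name `bdm_reads_per_sec` is hypothetical and the results are order-of-magnitude estimates, not measured values.

```python
def bdm_reads_per_sec(io_size_kb: int, throughput_mb_per_sec: float = 5) -> float:
    """Read requests per second needed to sustain the BDM throughput."""
    return throughput_mb_per_sec * 1024 / io_size_kb


as_designed = bdm_reads_per_sec(256)  # 256KB requests reach the spindles intact
split_up = bdm_reads_per_sec(64)      # controller splits each request into 64KB

print(as_designed, split_up, split_up / as_designed)
```

Splitting each 256KB request into 64KB requests multiplies the number of I/Os the spindles must service per database by four for the same throughput.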

14.3 What should I do if BDM is causing my Jetstress test to fail?

Firstly review your storage configuration against the recommended configuration here.

http://technet.microsoft.com/en-us/library/ee832792.aspx

Make sure that you have configured your Raid Group Stripe size appropriately and that you are not hitting storage throughput limits!

In the event that your chosen storage is not capable of meeting the requirements with BDM enabled, it is recommended that you reconsider your storage solution or configuration. It is possible to disable BDM on the active databases (it is permanently enabled on passive copies); however, this approach is not recommended since Jetstress cannot simulate a multi-copy DAG with BDM disabled. Additionally, it may put your deployment at risk due to insufficient CRC checksum scanning.

This issue is discussed further here:

http://blogs.technet.com/b/neiljohn/archive/2011/06/06/jetstress-2010-and-background-database-maintenance.aspx


15 Common Issues

15.1 Log or Data Volumes cannot be overlapped

This issue can arise when using Exchange Server 2007 ESE files and you attempt to store multiple LOG streams on the same logical disk volume. Although this is a valid Exchange server deployment, Jetstress uses the logical disk counters to record I/O latency values for each Database and Log stream. If multiple Databases and Logs are using the same logical disk then Jetstress is unable to isolate the performance counters required.

To work around this issue, edit the JetstressConfig.xml file and change <UseEseCounters> to true.

<UseEseCounters>true</UseEseCounters>

Save the XML File and attempt to re-configure the test with overlapped files. Jetstress will now allow the configuration since it has been configured to use the I/O performance data from ESE rather than the Operating System Logical Disk Counters.

15.2 Troubleshooting Jetstress

While using Jetstress, you may encounter some known issues. This section provides possible causes and the recommended solutions.

15.2.1 Jetstress cannot attach to or create a database

Event log error that may display: Error -1023

Possible cause: The path of the database or log files is incorrect.

Solution: Ensure that the paths and file names are correct.

Event log error that may display: Error -1032

Possible cause: Permissions are insufficient to access the .edb file or the log files.

Solution: Verify that permissions are sufficient for the account under which Jetstress is running. Jetstress requires read/write permission to the directories it is using.

Event log error that may display: Error -550 (0)

Possible cause: The last time Jetstress was run, it was ended uncleanly. This caused the log files to become unsynchronized with the database.

Solution: Delete the Jetstress database (*.edb), log files (*.log), and check file (*.chk), and re-create the Jetstress database. You can also use Eseutil.exe with the /r switch to resynchronize the logs and database.

Event log error that may display: Error -1022

Possible cause: The failure is caused by circular logging by Jetstress.


Solution: Check the log drive for the log file name that is identified in the event log. Delete that log file and all the log files that have a higher number in the file name. Then, run Eseutil.exe /r to recover Jetstress.edb. When the database is in a good state, delete all the log files in the log directory, and rerun Jetstress.

15.2.2 Error loading Performance Monitor counters

JetstressWin.exe relies on performance counters to monitor the system. JetstressWin.exe requires the ESE database counters to be installed.

Cause: When the counters are not loaded correctly, you may see exception errors related to performance counters.

Solution: To reload the counters, exit from JetstressWin.exe. Locate the directory where JetstressWin.exe was installed and verify that the eseperf.dll, eseperf.hxx, and eseperf.ini files exist in the directory. In a command shell window, type the command unlodctr ESE and then press Enter. This will unregister the ESE Database performance counters. Start JetstressWin.exe and allow it to reload the performance counters.

15.2.3 Database Performance counters not working after using Jetstress

Uninstalling Jetstress causes performance counter issues.

Cause: In certain cases when Exchange was installed before Jetstress, uninstalling Jetstress could result in unstable Database Performance counters.

Solution: To restore access to database counters after Jetstress has been removed, open a command prompt and go to the Exchange installation directory (e.g. D:\*\Exchange Server\V14\bin).

o From the command prompt, run the command unlodctr ESE to ensure that the database performance counter registration information has been completely removed from the registry.

o Run the command lodctr eseperf.ini, which should correctly register the database counters.

The database performance counters will be available the next time.

15.2.4 Unable to tune for the parameters

This error indicates that Jetstress could not find appropriate parameters that could be used to run a performance or stress test at the desired level of I/O load.

Cause: This can be caused by several factors. The most common reason is that the storage subsystem has multiple hosts attached to it, and those hosts are competing for common resources during the tuning process.

Solution: When you are running in a scenario such as this, you can run Jetstress on a single host with tuning enabled to generate the appropriate load parameters, and then rerun the test on the other hosts with the Suppress Tuning option enabled and the tuning parameters entered manually from the results of the first test.


15.2.5 Unable to mount databases due to invalid mount point configuration

When you use mount points and run the Prepare phase of Jetstress, the operation fails with the error “There is insufficient disk space on volume <system drive>:\”, where <system drive> is the drive letter that holds your root mount folder.

Cause: One or more of the mount points is invalid, or a mount point folder path is not connected to its LUN. Database creation fails with a message that volume C: (or, in general, the system volume) does not have enough space. Because some of the mount points mapped to directories on the system volume are not properly configured, Jetstress checks the directory itself, and therefore the system drive, rather than the actual disk.

Troubleshooting: Execute a DIR command in the mount point root folder.

All correctly attached mount point folder paths are marked <JUNCTION> in the listing. Any folder that is listed as <DIR> is not attached to its mount point and is likely causing the problem.
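When the mount point root holds many folders, the DIR listing can be scanned automatically. The Python sketch below is an illustration only: the function name unattached_folders is hypothetical, and it assumes each listing line carries a <DIR> or <JUNCTION> marker followed by the folder name, as described above. It takes the saved text of a DIR listing and returns the folders marked <DIR>, skipping the "." and ".." entries.

```python
import re

# Matches the type marker and the text that follows it on a DIR listing line.
ENTRY = re.compile(r"<(DIR|JUNCTION)>\s+(.+?)\s*$")


def unattached_folders(dir_output):
    """Return folder names that a DIR listing marks <DIR> instead of <JUNCTION>."""
    suspects = []
    for line in dir_output.splitlines():
        match = ENTRY.search(line)
        if match is None:
            continue  # header, summary, or file line
        kind, name = match.groups()
        # "." and ".." are always plain directories and are not mount points.
        if kind == "DIR" and name not in (".", ".."):
            suspects.append(name)
    return suspects
```

The listing text could be captured with a command such as dir > listing.txt in the mount point root folder and then passed to the function. Every name it returns is a candidate for the checks in the solution that follows.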

Solution: The mount path folder could be listed as <DIR> for a number of reasons:

1. Verify the LUN is present and in good health.

2. Using the storage system array management software, verify the LUN has an assigned logical drive.

3. Using the Disk Management MMC, re-assign the LUN to the correct mount-point.
