
ibm.com/redbooks

Front cover

IBM TotalStorage FAStT Best Practices

Bertrand Dufrasne
Andreas Groth
Arthur Letts
Andrew Meakin
Peter Mescher

FAStT concepts and planning

Implementation, migration, and tuning tips

VMware ESX Server support


International Technical Support Organization

IBM TotalStorage FAStT Best Practices

September 2004

SG24-6363-00


© Copyright International Business Machines Corporation 2004. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

First Edition (September 2004)

This edition applies to VMware ESX Server 2.1 and IBM TotalStorage FAStT products that were current as of July 2004.

Note: Before using this information and the product it supports, read the information in “Notices” on page xv.


Contents

Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix

Notices . . . . . xv
Trademarks . . . . . xvi

Preface . . . . . xvii
The team that wrote this redbook . . . . . xviii
Become a published author . . . . . xix
Comments welcome . . . . . xix

Summary of changes . . . . . xxi
July 2004, First Edition . . . . . xxi

Part 1. FAStT introduction, planning, configuration, and maintenance . . . . . . . . . . . . . . . . . . . . . . . . . 1

Chapter 1. Introduction to FAStT and SAN . . . . . 3
1.1 FAStT features and models . . . . . 4
1.2 FAStT Storage Manager . . . . . 7

1.2.1 FAStT Storage Manager components . . . . . 7
1.2.2 New features introduced in Storage Manager Version 8.4 . . . . . 8

1.3 Introduction to SAN . . . . . 12
1.3.1 SAN components . . . . . 13

Chapter 2. FAStT planning tasks . . . . . 13
2.1 Planning your SAN . . . . . 14
2.2 SAN zoning . . . . . 15

2.2.1 Zone types . . . . . 16
2.2.2 Zoning configuration . . . . . 17

2.3 Physical components and characteristics . . . . . 17
2.3.1 Rack considerations . . . . . 17
2.3.2 Cables and connectors . . . . . 19
2.3.3 Cable management and labeling . . . . . 23
2.3.4 Fibre Channel adapters . . . . . 25
2.3.5 Planning your storage structure and performance . . . . . 27
2.3.6 Logical drives and controller ownership . . . . . 35
2.3.7 Segment size . . . . . 38
2.3.8 Storage partitioning . . . . . 39
2.3.9 Cache parameters . . . . . 40
2.3.10 Hot-spare drive . . . . . 44
2.3.11 Remote Volume Mirroring . . . . . 44

2.4 Additional planning considerations . . . . . 47
2.4.1 Planning for systems with LVM: AIX example . . . . . 47
2.4.2 Planning for systems without LVM: Windows example . . . . . 49
2.4.3 The function of ADT and a multipath driver . . . . . 51
2.4.4 ADT alert notification . . . . . 53
2.4.5 Failover alert delay . . . . . 53

Chapter 3. FAStT configuration tasks . . . . . 57
3.1 Preparing the FAStT Storage Server . . . . . 58

3.1.1 Network setup of the controllers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58


3.1.2 Installing and starting the FAStT Storage Manager Client . . . . . 59
3.1.3 Updating the controller microcode . . . . . 61

3.2 FAStT cabling . . . . . 62
3.2.1 FAStT600 cabling configuration . . . . . 62
3.2.2 FAStT700/FAStT900 cabling configuration . . . . . 65
3.2.3 Expansion unit numbering . . . . . 70

3.3 Configuring the FAStT Storage Server . . . . . 70
3.3.1 Defining hot-spare drives . . . . . 71
3.3.2 Creating arrays and logical drives . . . . . 72
3.3.3 Configuring storage partitioning . . . . . 73

Chapter 4. FAStT maintenance tasks . . . . . 77
4.1 Performance monitoring and tuning . . . . . 78

4.1.1 The performance monitor . . . . . 78
4.1.2 Tuning cache parameters . . . . . 80

4.2 Controlling the performance impact of maintenance tasks . . . . . 80
4.2.1 Modification operations . . . . . 80
4.2.2 Remote Volume Mirroring operations . . . . . 81
4.2.3 VolumeCopy priority rates . . . . . 82
4.2.4 FlashCopy operations . . . . . 82

4.3 Event monitoring and alerts . . . . . 83
4.3.1 FAStT Service Alert . . . . . 83

4.4 Saving the subsystem profile . . . . . 87
4.5 Upgrades and maintenance . . . . . 88

4.5.1 Being up-to-date with your drivers and firmware using My support . . . . . 88
4.5.2 Prerequisites for upgrades . . . . . 88
4.5.3 Updating FAStT host software . . . . . 89
4.5.4 Updating microcode . . . . . 90

4.6 Capacity upgrades, system upgrades . . . . . 90
4.6.1 Capacity upgrades and increased bandwidth . . . . . 90
4.6.2 System upgrade and disk migration procedures . . . . . 93
4.6.3 Other considerations when adding expansion enclosures and drives . . . . . 95

Part 2. Advanced topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

Chapter 5. Migrating 7133 to FAStT in AIX . . . . . 99
5.1 Performance sizing considerations . . . . . 100

5.1.1 SSA adapters and FAStT adapters performance comparison . . . . . 100
5.2 Sizing a solution based upon the SSA configuration . . . . . 102
5.3 Sizing a solution based upon the application IO rates . . . . . 102
5.4 Sizing a solution for more performance . . . . . 104
5.5 Setting up the FAStT prior to migration . . . . . 104

5.5.1 Install FAStT software on the AIX host server . . . . . 104
5.6 Performing the migration . . . . . 111

5.6.1 Logical Volume Manager . . . . . 112
5.6.2 Migration procedure . . . . . 114
5.6.3 Mirroring the VG (method 1) illustration . . . . . 115
5.6.4 Migrating PVs (method 2) illustration . . . . . 116
5.6.5 Other considerations . . . . . 116
5.6.6 HACMP considerations . . . . . 116

Chapter 6. IBM migration services . . . . . 119
6.1 Data migration . . . . . 120
6.2 Piper: IBM hardware-assisted data migration . . . . . 120


6.3 What is Piper Lite? . . . . . 120
6.3.1 Piper hardware appliance . . . . . 121
6.3.2 Piper Migration Director . . . . . 122
6.3.3 Piper Migration Surveyor . . . . . 123
6.3.4 Data migration process . . . . . 124

Chapter 7. FAStT and HACMP for AIX . . . . . 127
7.1 HACMP introduction . . . . . 128
7.2 Supported environment . . . . . 129

7.2.1 General rules . . . . . 130
7.2.2 Configuration limitations . . . . . 130

Chapter 8. FAStT and GPFS for AIX . . . . . 133
8.1 GPFS introduction . . . . . 134
8.2 Supported configurations . . . . . 135

Part 3. VMware and FAStT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

Chapter 9. Introduction to VMware . . . . . 141
9.1 VMware, Inc. . . . . . 142
9.2 The IBM and VMware relationship . . . . . 142
9.3 VMware ESX Server v2.1 architecture . . . . . 143
9.4 VMware ESX Server storage structure: disk virtualization . . . . . 146

9.4.1 Local disk usage . . . . . 146
9.4.2 SAN disk usage . . . . . 147
9.4.3 Disk virtualization with VMFS volumes and .dsk files . . . . . 147
9.4.4 The Buslogic and LSI SCSI controllers . . . . . 149
9.4.5 The complete picture of ESX and FAStT storage . . . . . 150

9.5 FAStT and ESX Server solution considerations . . . . . 151
9.5.1 General performance and sizing considerations . . . . . 151
9.5.2 Which model of FAStT should be used in a VMware implementation? . . . . . 152
9.5.3 FAStT tuning considerations . . . . . 153

9.6 IBM eServer BladeCenter and ESX Server . . . . . 155
9.6.1 Introduction to the IBM eServer BladeCenter . . . . . 156
9.6.2 BladeCenter disk storage . . . . . 158

Chapter 10. VMware ESX Server terminology, features, limitations, and tips . . . . . 163
10.1 Storage Management: naming conventions and features . . . . . 164

10.1.1 Disks and LUNs . . . . . 164
10.1.2 Failover paths and failover policies . . . . . 166
10.1.3 Adapter bindings . . . . . 171
10.1.4 VMFS or raw disks? . . . . . 172
10.1.5 Booting from SAN and use of NAS devices . . . . . 173
10.1.6 Direct attached storage . . . . . 173
10.1.7 FAStT premium features: RVM, FlashCopy, VolumeCopy . . . . . 174
10.1.8 Network considerations . . . . . 175

Chapter 11. VMware ESX Server storage configurations . . . . . 179
11.1 Introduction . . . . . 180
11.2 Using redundant paths from the switches to the FAStT . . . . . 181

11.2.1 Recommended configuration . . . . . 181
11.2.2 Workaround for LUN discovery . . . . . 184

11.3 Configurations by implementation . . . . . 186
11.3.1 Single HBA with single switch . . . . . 186


11.3.2 Multiple HBAs with multiple switches . . . . . 187
11.4 Configurations by function . . . . . 189

11.4.1 Independent VMFS volumes . . . . . 191
11.4.2 Public VMFS volumes . . . . . 191
11.4.3 Clustering . . . . . 192

11.5 Zoning options . . . . . 194
11.6 BladeCenter specifics . . . . . 196

11.6.1 The configuration tested in this redbook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196

Chapter 12. Installing VMware ESX Server . . . . . 197
12.1 Assumptions and requirements . . . . . 198
12.2 HBA configuration . . . . . 199
12.3 Fibre switch configuration . . . . . 200

12.3.1 Single path . . . . . 201
12.3.2 Dual path: crossover . . . . . 202
12.3.3 Zoning the Qlogic FCSM of a BladeCenter . . . . . 203
12.3.4 Considerations for attaching multiple hosts . . . . . 206

12.4 FAStT NVSRAM settings . . . . . 206
12.5 LUN configuration and storage partitioning . . . . . 207

12.5.1 Configurations without LUN sharing . . . . . 209
12.5.2 Configurations with LUN sharing . . . . . 211

12.6 Verifying the storage setup . . . . . 213
12.7 ESX Server installation . . . . . 215

12.7.1 Text mode installation of ESX Server (BladeCenter without USB mouse) . . . . . 215
12.7.2 Installation of ESX using the Graphical Installer . . . . . 230

12.8 Configuring the ESX Server swap file . . . . . 242
12.8.1 Swap file: required steps for blade servers with IDE drives . . . . . 244
12.8.2 Creating and activating the swap file . . . . . 247

12.9 Configuring the virtual network . . . . . 250
12.9.1 Configuring the network for all systems except HS20 blade servers . . . . . 250
12.9.2 Configuring the virtual network on HS20 blade servers . . . . . 252

12.10 ESX Server advanced settings and Qlogic parameters . . . . . 262
12.10.1 ESX Server advanced settings . . . . . 262
12.10.2 Adjust Qlogic driver parameters . . . . . 263

12.11 Planning disk resources and creating VMFS partitions . . . . . 265
12.11.1 Considerations and guidelines . . . . . 265
12.11.2 Creating VMFS partitions . . . . . 270

12.12 Creating virtual machines . . . . . 273
12.12.1 Creating the virtual disk resources . . . . . 275
12.12.2 Modifying the disk resources as shared drives for clustering . . . . . 279

12.13 Guest OS specific settings and remarks . . . . . 284
12.13.1 Clustered Windows guest OS settings . . . . . 284
12.13.2 Installing MSCS 2003 on a hybrid cluster . . . . . 285

Chapter 13. Redundancy by configuration . . . . . 287
13.1 Single path configuration . . . . . 288
13.2 Dual path redundant configuration . . . . . 289

Related publications . . . . . 291
IBM Redbooks . . . . . 291
Other publications . . . . . 291
Online resources . . . . . 291
How to get IBM Redbooks . . . . . 292
Help from IBM . . . . . 292


Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293


Figures

1-1 The IBM TotalStorage FAStT series . . . . . 4
1-2 Positioning the FAStT Storage family . . . . . 6
1-3 What is a SAN? . . . . . 12
1-4 SAN components . . . . . 13
2-1 SAN zoning . . . . . 16
2-2 9306 Enterprise rack space requirements . . . . . 19
2-3 Small Form Factor Transceiver (left) and GBIC (right) . . . . . 21
2-4 Dual SC fiber-optic plug connector . . . . . 21
2-5 SFF hot-pluggable transceiver (SFP) with LC connector fiber cable . . . . . 22
2-6 Fibre Channel LC-SC adapter cable . . . . . 22
2-7 AIX: Four paths to FAStT . . . . . 26
2-8 Load sharing approach for multiple HBAs . . . . . 26
2-9 Iterative steps for system performance tuning . . . . . 27
2-10 RAID 0 . . . . . 28
2-11 RAID 1 . . . . . 29
2-12 RAID 5 . . . . . 30
2-13 RAID 10 . . . . . 31
2-14 No channel protection versus channel protection . . . . . 35
2-15 Automatic configuration feature . . . . . 35
2-16 Balancing LUNs . . . . . 36
2-17 Preferred controller ownership . . . . . 37
2-18 Redistribute logical drives . . . . . 37
2-19 Conceptual model of disk caching . . . . . 41
2-20 Default values used by the Create Logical Drive Wizard . . . . . 42
2-21 Remote Volume Mirroring . . . . . 45
2-22 Intersite high availability solution . . . . . 46
2-23 Different functional levels . . . . . 47
2-24 AIX LVM conceptual view . . . . . 48
2-25 Inter-disk allocation . . . . . 49
2-26 Example of alert notification in MEL of an ADT/RDAC logical drive failover . . . . . 53
2-27 Changing the failover alert delay . . . . . 55
2-28 Failover Alert Delay dialog box . . . . . 55
3-1 Initial Automatic Discovery . . . . . 60
3-2 Enterprise Management window . . . . . 60
3-3 First launch of the Subsystem Management window . . . . . 61
3-4 FAStT600 cabling configuration . . . . . 63
3-5 FAStT600 connected through managed hub or Fibre switches . . . . . 64
3-6 Dual FAStT600 connected through Fibre switches . . . . . 64
3-7 Dual expansion unit Fibre Channel cabling . . . . . 65
3-8 Rear view of the FAStT900 Storage Server . . . . . 66
3-9 Connecting hosts directly to the controller . . . . . 66
3-10 Using two Fibre Channel switches to connect a host . . . . . 67
3-11 FAStT900 drive-side Fibre Channel cabling . . . . . 69
4-1 Capacity scaling . . . . . 91
4-2 Increasing bandwidth . . . . . 91
4-3 EXP700 ESM board diagram . . . . . 92
4-4 The DACstore area of a FAStT disk drive . . . . . 94
4-5 Upgrading FAStT controllers . . . . . 95


4-6 Migrating RAID arrays . . . . . 95
5-1 Output from iostat command . . . . . 103
5-2 Storage Manager client 8.4 for AIX . . . . . 106
5-3 Logical Drive Properties . . . . . 110
5-4 Logical Volume Manager architecture . . . . . 112
5-5 Relationships between LP and PP . . . . . 113
5-6 Sample lsvg output . . . . . 115
5-7 FAStT mirror using LVM . . . . . 117
6-1 A screen from Piper Migration Director . . . . . 123
6-2 A screen from the Piper Migration Surveyor . . . . . 123
6-3 Piper Lite connected in customer system . . . . . 124
7-1 HACMP cluster with attachment to FAStT . . . . . 130
8-1 Simple GPFS model . . . . . 134
9-1 VMware ESX Server architecture . . . . . 144
9-2 FAStT disk virtualization . . . . . 146
9-3 The ESX Server Console OS disks and partition . . . . . 146
9-4 FAStT logical volumes to VMware VMFS volumes . . . . . 147
9-5 VMFS volumes and .dsk files . . . . . 148
9-6 ESX Server .dsk modes . . . . . 149
9-7 The logical disk structure of ESX Server . . . . . 150
9-8 Unrealistic storage consolidation . . . . . 153
9-9 Potentially realistic storage consolidation . . . . . 154
9-10 FC switch (illustrated) or OPM installation in a BladeCenter chassis . . . . . 159
9-11 BladeCenter with FC switch modules direct to FAStT . . . . . 161
9-12 BladeCenter with FC switch modules and external FC switches (fabric) to FAStT . . . . . 162
10-1 Disks and LUNs view . . . . . 164
10-2 Failover Path view - single HBA . . . . . 167
10-3 Sample configuration for path thrashing . . . . . 169
10-4 Changing the path settings . . . . . 170
10-5 Path example for dual path - crossover configuration . . . . . 171
10-6 Adapter Bindings window . . . . . 172
10-7 Virtual Ethernet Switch principle . . . . . 175
10-8 Available physical (outbound) network adapters . . . . . 177
10-9 Port groups . . . . . 178
11-1 LUN discovery with multiple paths to each controller . . . . . 181
11-2 Multiple paths using extra mini-hubs . . . . . 182
11-3 Multiple paths using an inter-switch link (ISL) . . . . . 183
11-4 The switch to FAStT configuration that was tested for the redbook . . . . . 184
11-5 LUN discovery in ESX Server . . . . . 185
11-6 Single server with single HBA configuration sample . . . . . 186
11-7 Multiple independent servers with single HBAs configuration sample . . . . . 187
11-8 Single server with dual HBAs configuration sample . . . . . 188
11-9 Multiple servers with multiple HBAs - configuration sample . . . . . 189
11-10 Multiple servers sharing a storage partition configuration sample . . . . . 191
11-11 Local virtual machine cluster . . . . . 192
11-12 Split virtual machine cluster . . . . . 193
11-13 Single switch with multiple zones . . . . . 194
11-14 Multiple switches with single zones . . . . . 195
11-15 Multiple switches with multiple zones . . . . . 195
11-16 BladeCenter configuration sample . . . . . 196
12-1 Host adapter selection in the Fast!UTIL setup program . . . . . 199
12-2 Restoring the default settings . . . . . 199
12-3 Displaying the Adapter Port Name . . . . . 200


12-4 Switch example - single path . . . . . 201
12-5 Zoning example - single path . . . . . 201
12-6 Switch example - dual path crossover . . . . . 202
12-7 Zoning example - dual path crossover . . . . . 202
12-8 Adding a fabric . . . . . 203
12-9 BladeCenter SAN Utility - Main view . . . . . 204
12-10 Zoning sample of FCSM1 . . . . . 204
12-11 Zoning sample of FCSM2 . . . . . 205
12-12 Activate zoning . . . . . 205
12-13 Use LNXCL as host type . . . . . 207
12-14 LUN creation with Storage Manager . . . . . 208
12-15 Example of combined shared and unshared logical drives . . . . . 209
12-16 Scan Fibre Devices . . . . . 209
12-17 Select host type LNXCL for the ESX host ports . . . . . 210
12-18 Storage partitioning - single server / dual path attachment without LUN sharing . . . . . 210
12-19 Multiple independent servers / dual path attachment without LUN sharing . . . . . 211
12-20 Scan Fibre Devices . . . . . 212
12-21 Select host type LNXCL for the ESX host ports . . . . . 212
12-22 Storage partitioning - two servers with dual path attachment and LUN sharing . . . . . 213
12-23 Host adapter selection in the Fast!UTIL setup program . . . . . 214
12-24 Scan Fibre Devices . . . . . 214
12-25 Result of the Scan Fibre Device showing the FAStT controller . . . . . 214
12-26 VMware Welcome screen . . . . . 216
12-27 No USB mouse present . . . . . 216
12-28 ESX Server support site . . . . . 217
12-29 Custom install selection . . . . . 217
12-30 Keyboard selection . . . . . 217
12-31 Mouse selection . . . . . 218
12-32 End user license agreement . . . . . 218
12-33 ESX server and SMP Serial numbers . . . . . 219
12-34 Disk Partitioning setup . . . . . 219
12-35 Hard Disk Partitioning . . . . . 220
12-36 Network Configuration . . . . . 221
12-37 Root Password . . . . . 221
12-38 Adding user IDs . . . . . 222
12-39 List of User IDs, you can add additional users later . . . . . 222
12-40 Installation Log Location . . . . . 223
12-41 Installation progress screen . . . . . 223
12-42 ESX Server 2.1 server installation complete . . . . . 223
12-43 VMware ESX Server console screen . . . . . 224
12-44 Security certificate . . . . . 224
12-45 Login to VMware Management Interface . . . . . 225
12-46 Cancel the wizard . . . . . 225
12-47 VMware Management Interface incomplete installation swap space warning . . . . . 226
12-48 VMware Management Interface - Options . . . . . 226
12-49 Startup Profile . . . . . 227
12-50 Reboot screen . . . . . 228
12-51 Standby screen . . . . . 229
12-52 LILO Boot Menu . . . . . 229
12-53 Welcome screen . . . . . 229
12-54 Welcome screen . . . . . 230
12-55 ESX 2.1 GUI installation welcome screen . . . . . 231
12-56 Custom installation . . . . . 231


12-57 Keyboard setup . . . . . 232
12-58 Mouse setup . . . . . 232
12-59 License acceptance . . . . . 233
12-60 ESX Server Serial number entry . . . . . 233
12-61 Device Allocation sample . . . . . 234
12-62 Hard Disk partitioning . . . . . 235
12-63 Confirm remove all partitions . . . . . 236
12-64 Default Partition Allocation . . . . . 236
12-65 Network Configuration . . . . . 238
12-66 Time Zone configuration . . . . . 238
12-67 Account Configuration . . . . . 239
12-68 Installation is about to start . . . . . 239
12-69 Installation complete . . . . . 240
12-70 LILO Boot Menu . . . . . 240
12-71 VMware welcome screen . . . . . 241
12-72 Security certificate . . . . . 242
12-73 Login to VMware Management Interface . . . . . 243
12-74 VMware Management Interface - swap space warning . . . . . 243
12-75 Options screen . . . . . 244
12-76 Available LUNs . . . . . 245
12-77 VMFS Volume creation . . . . . 245
12-78 Creation of Core Dump Partition . . . . . 246
12-79 VMFS Logical drive name . . . . . 246
12-80 Core Dump and VMFS created . . . . . 247
12-81 Reconfigure... swap space . . . . . 247
12-82 Swap Configuration . . . . . 248
12-83 Creating the swap file . . . . . 248
12-84 Swap configured but not activated . . . . . 249
12-85 VMware Management Interface . . . . . 249
12-86 Warning for unconfigured network . . . . . 250
12-87 Configuring the Virtual Ethernet Switch . . . . . 251
12-88 Creating a virtual network switch . . . . . 251
12-89 Configured virtual network switch with two adapters . . . . . 252
12-90 vmkpcidivy -i . . . . . 254
12-91 vmkpcidivy -i cont. . . . . . 254
12-92 vmkpcidivy -i cont. . . . . . 255
12-93 Verifying adapter assignment using the MUI . . . . . 256
12-94 Editing hwconfig to enable bond . . . . . 257
12-95 Options screen . . . . . 258
12-96 Configured virtual network switch with two adapters . . . . . 259
12-97 Network setup . . . . . 260
12-98 Enable trunking on external ports . . . . . 261
12-99 Enabling switch failover . . . . . 261
12-100 Adjusted ESX settings . . . . . 263
12-101 Editing the hwconfig file using a remote putty.exe session . . . . . 263
12-102 Adding Qlogic driver options in hwconfig . . . . . 264
12-103 Verify new Qlogic settings . . . . . 265
12-104 LUNs mapped to host group . . . . . 269
12-105 LUNs mapped to ESX server . . . . . 269
12-106 Options screen . . . . . 270
12-107 Available LUNs . . . . . 271
12-108 VMFS Volume creation . . . . . 272
12-109 VMFS Logical drive name . . . . . 272


12-110 Core Dump and VMFS created . . . . . 273
12-111 Adding a vm . . . . . 274
12-112 Select OS . . . . . 274
12-113 Specify resources . . . . . 275
12-114 Select type of virtual disk . . . . . 276
12-115 Create new VMFS based disk . . . . . 276
12-116 Finish device configuration . . . . . 277
12-117 Assigning a raw disk . . . . . 278
12-118 Finish device configuration . . . . . 279
12-119 Sample config for clustering using raw disks before modifications . . . . . 280
12-120 Setting SCSI Controller 1 to mode shared virtual . . . . . 281
12-121 SCSI Controller 1 configured for shared virtual . . . . . 281
12-122 Sample config for clustering using raw disks before modifications . . . . . 282
12-123 Setting SCSI Controller 1 to mode shared physical . . . . . 283
12-124 SCSI Controller 1 configured for shared physical . . . . . 283
12-125 Using regedt32 to create TimeOutValue . . . . . 284
12-126 Editing the value . . . . . 285
12-127 Adding the second node . . . . . 285
12-128 Error on adding node . . . . . 286
12-129 Avoiding installation errors . . . . . 286


Notices

This information was developed for products and services offered in the U.S.A.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive Armonk, NY 10504-1785 U.S.A.

The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.

IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

Screen shots reprinted by permission from VMware Inc.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.

COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrates programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. You may copy, modify, and distribute these sample programs in any form without payment to IBM for the purposes of developing, using, marketing, or distributing application programs conforming to IBM's application programming interfaces.


Trademarks

The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:

AIX 5L™, AIX®, BladeCenter™, Buslogic®, DFS™, Enterprise Storage Server®, Eserver®, FICON®, FlashCopy®, HACMP™, ibm.com®, IBM®, iSeries™, Netfinity®, POWER™, Predictive Failure Analysis®, pSeries®, Redbooks (logo)™, Redbooks™, RS/6000®, SANergy®, ServeRAID™, ServerProven®, ServicePac®, Tivoli®, TotalStorage®, X-Architecture™, xSeries®, z/OS®

The following terms are trademarks of other companies:

VMware and ESX Server are trademarks of VMware, Inc.

Intel and Intel Inside (logos) are trademarks of Intel Corporation in the United States, other countries, or both.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

Java and all Java-based trademarks and logos are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Other company, product, and service names may be trademarks or service marks of others.


Preface

This IBM Redbook represents a compilation of best practices for configuring FAStT and gives hints and tips for an expert audience on topics such as GPFS, HACMP, clustering, VMware ESX Server support, and FAStT migration. It is an update and replacement for the IBM Redpaper, REDP3690.

Setting up a FAStT Storage Server can be a complex task. There is no single configuration that will be satisfactory for every application or situation.

Part 1 provides the conceptual framework for understanding FAStT in a Storage Area Network and includes recommendations, hints, and tips for the physical installation, cabling, and zoning. Although no performance figures are included, we discuss the performance and tuning of various components and features to guide you when working with FAStT.

Part 2 presents and discusses advanced topics, including a technique for migrating from 7133 SSA disks to FAStT, High Availability Cluster Multiprocessing (HACMP™) and General Parallel File System (GPFS), in an AIX® environment, as they relate to FAStT.

Part 3 is dedicated to the VMware ESX 2.1 Server environment deployed on either IBM Eserver xSeries® or BladeCenter™ equipment and provides substantial information for different configurations and attachment to FAStT.

This book is intended for IBM technical professionals, Business Partners, and customers responsible for the planning, deployment, and maintenance of IBM TotalStorage FAStT products.


The team that wrote this redbook

This redbook was produced by a team of specialists from around the world working at the International Technical Support Organization, San Jose Center.

Bertrand Dufrasne is a Certified Consulting I/T Specialist and Project Leader for Disk Storage Systems at the International Technical Support Organization, San Jose Center. He has worked at IBM in various I/T areas. Before joining the ITSO, he worked for IBM Global Services in the U.S. as an I/T Architect. He holds a degree in Electrical Engineering.

Andreas Groth is the Lead Engineer of the EMEA xSeries Advanced Technical Support (ATS) and a certified Senior IT Specialist, located in Greenock, Scotland. He has over 8 years of experience in designing and supporting Intel® based solutions. He currently acts as the official technical presales lead for VMware in EMEA. Besides virtualization technologies, his areas of expertise are server consolidation, SAN, and complex project support. He holds a degree in Electronic and Mechanical Engineering.

Andy Meakin is an xSeries Technical Sales Specialist with the IBM Systems Group in Australia. He has over 6 years of experience (including 2.5 years with IBM) in technical sales support for Intel solutions, architectures, and related products. He uses this experience to deliver creative IBM solutions for key clients in Western Australia, especially in the areas of server and storage consolidation.

Peter Mescher is a Product Engineer for the IBM Solutions Central Support Team in Research Triangle Park, North Carolina. He has 4 years of experience in the Storage Area Networking field. He holds a degree in Computer Engineering from Lehigh University. His areas of expertise include SAN problem determination, switching/routing protocols, and host configuration. He is one of the co-authors of the Storage Networking Industry Association Level 3 (Specialist) certification exam.

Arthur Letts is an IT Specialist with IBM South Africa. He works in Infrastructure and Systems Management Services, a department in Integrated Technology Services. He has 8 years of experience in Windows® Client/Server computing and 1 year in Linux.

Thanks to the following people for their contributions to this project:

Yvonne Lyon International Technical Support Organization, San Jose Center

Chuck Grimm, Joakim Hansson, Alex Candelaria, Danh Le, Matthew Darlington, Dan Braden, Bruce Allworth, Rainer Wolafka, Jeffrey Cauhappe


John Hartman, Jim Goodwin, Jonathan Wright, Massimo Chiriatti, Massimo Re Ferre’, Fabiano Matassa, Bob Seay
IBM Corporation

David Worley
Engenio

Lance Berc, Olivier Lecomte, John Hawkins
VMware

Become a published author

Join us for a two- to six-week residency program! Help write an IBM Redbook dealing with specific products or solutions, while getting hands-on experience with leading-edge technologies. You'll team with IBM technical professionals, Business Partners and/or customers.

Your efforts will help increase product acceptance and customer satisfaction. As a bonus, you'll develop a network of contacts in IBM development labs, and increase your productivity and marketability.

Find out more about the residency program, browse the residency index, and apply online at:

ibm.com/redbooks/residencies.html

Comments welcome

Your comments are important to us!

We want our Redbooks™ to be as helpful as possible. Send us your comments about this or other Redbooks in one of the following ways:

• Use the online Contact us review redbook form found at:

ibm.com/redbooks

• Send your comments in an Internet note to:

[email protected]

• Mail your comments to:

IBM® Corporation, International Technical Support Organization
Dept. QXXE Building 80-E2
650 Harry Road
San Jose, California 95120-6099


Summary of changes

This section describes the technical changes made in this edition of the book and in previous editions. This edition may also include minor corrections and editorial changes that are not identified.

Summary of Changes
for SG24-6363-00
for IBM TotalStorage FAStT Best Practices
as created or updated on June 2, 2005.

July 2004, First Edition

This book replaces the IBM Redpaper, REDP3690, and reflects the addition, deletion, or modification of new and changed information described below.

New information
• Procedure for migration to FAStT under AIX
• Piper Lite service offering
• VMware ESX Server 2.1 attaching to FAStT

Changed information
• Various updates to reflect changes in the FAStT product line
• Reorganized the structure of existing chapters


Part 1 FAStT introduction, planning, configuration, and maintenance

Starting with a brief review of the IBM TotalStorage FAStT product line, the different models and their characteristics, and an introduction to Storage Area Networks (SAN), this part of the book is essentially dedicated to providing guidance and explaining best practices for the planning, implementation, and maintenance of a FAStT solution.


Chapter 1. Introduction to FAStT and SAN

This chapter introduces the IBM TotalStorage FAStT Storage Server products with a brief description of the different models, their features, and where they fit in terms of a storage solution. This chapter also summarizes the functions of the FAStT Storage Manager software and emphasizes the features of Storage Manager Version 8.4.

Finally, this chapter reviews some of the basic concepts and topologies of Storage Area Networks as we refer to these in other parts of the book.


1.1 FAStT features and models

IBM TotalStorage Fibre Array Storage Technology (FAStT) Storage Server is a Redundant Array of Independent Disks (RAID) storage subsystem that contains the Fibre Channel (FC) interfaces to connect both the host systems and the disk drive enclosures. The FAStT family provides an excellent platform for storage consolidation in entry, midrange, and enterprise class open systems.

The FAStT Storage Server provides high system availability through the use of hot-swappable and redundant components. This is crucial when the Storage Server is placed in high-end customer environments on Storage Area Networks (SANs).

The FAStT systems include built-in features at no additional cost, such as the autonomic functions Dynamic Logical Drive Expansion and Dynamic Capacity Addition, which allow unused storage to be brought online without stopping operations, and FAStT Service Alert, which can automatically alert IBM if a problem occurs. Most models also offer great disk flexibility by allowing mixed disk capacities.

Depending on the model, premium features can be bought for a one-time charge. These premium features include Storage Partitioning (the ability to map logical drives for access by specific hosts only) and firmware-based copy services consisting of FlashCopy® (a quick, point-in-time image of a logical drive), VolumeCopy (cloning of a logical drive), and Remote Volume Mirroring (real-time data replication between FAStT subsystems over a remote distance, for business continuance and disaster recovery).

Figure 1-1 shows the characteristics and the evolution of the IBM TotalStorage FAStT series.

Figure 1-1 The IBM TotalStorage FAStT series

The figure illustrates how the range evolves in terms of performance, availability, scalability, and redundancy, with the following key characteristics per model:

• FAStT200: 3U, single or dual 1 Gb controller(s); RAID cache 128 MB (single controller), upgradable to 256 MB; maximum 60 SL/HH drives; 2 host FC ports
• FAStT100: dual 2 Gb controllers; RAID cache 512 MB; Serial ATA drives; maximum 56 SL drives; 4 host FC ports
• FAStT600 (single/dual/Turbo) with Turbo: dual 2 Gb controllers; RAID cache 2 GB; maximum 112 drives; 4 host FC ports; copy service options; 30-100%+ performance boost over the base FAStT600
• FAStT700: dual 2 Gb controllers; RAID cache 2 GB; maximum 224 drives; 4 host mini hubs; copy services options
• FAStT900: dual 2 Gb controllers; RAID cache 2 GB; maximum 224 drives; 4 host mini hubs; copy services options; highest performance


The IBM TotalStorage FAStT series includes the following components:

• FAStT100 Storage Server:

The IBM TotalStorage FAStT100 Storage Server is a disk solution designed to provide small and medium-sized companies with a high performance, long-term storage solution for infrequently accessed data (that is, archived data). With up to 14 TB of Serial ATA (SATA) physical disk storage — provided by up to 14 internal 250 GB SATA disk drives inside the controller and up to three EXP100s — the FAStT100 can provide ample yet scalable storage without the cost of extra expansion units. With four Fibre Channel ports to attach to servers, the IBM TotalStorage FAStT100 inherits many of the advanced functions and capabilities of the FAStT600, but is optimized for near-line storage with low-cost SATA drive technology. (The 14 TB figure is worked through in the capacity sketch that follows this list.)

• FAStT200 Storage Server:

The FAStT200 is designed for workgroup and departmental servers that require an external storage solution. The single controller model provides a cost-effective solution, and the FAStT200 High Availability (HA) model features a fully redundant configuration with dual-active controllers. The FAStT200 enclosure holds 10 drives. As your storage requirements grow, you can easily expand storage capacity by adding IBM FAStT EXP500 or EXP700 expansion units, scaling from 18 GB to 1.47 TB in a compact 3U size, with a maximum system capacity of 9.6 TB. FlashCopy is supported and provides fast data duplication capability, reducing or eliminating the need for long shutdowns during backups and restores.

• FAStT600 and FAStT600 Turbo Storage Servers:

The FAStT600 is an entry-level (now available as either single or dual controller versions), highly scalable Fibre Channel storage server with a 2 Gbps host interface.

It is designed to be a cost-effective, scalable storage server for consolidation and clustering applications. Its modular architecture can support on demand business models by enabling an entry configuration that can easily grow as storage demands increase. The dual controller model supports up to 56 drives (up to 8.2 TB capacity using three EXP700 Expansion units).

The FAStT600 Turbo is a mid-level storage server that can scale to over 16 TB, facilitating storage consolidation for medium-sized customers. It provides an end-to-end 2 Gbps Fibre Channel solution (the host interface on the base FAStT600 is 2 Gbps, and the Turbo auto-senses to connect at 1 Gbps or 2 Gbps). It offers higher scalability than the base FAStT600, up to 16.4 TB with a total of 112 disks, using a maximum of seven EXP700s. Alternatively, the FAStT600 Turbo supports attachment of up to eight EXP100 expansion units, which also allows for up to 112 disk drives, offering a potential capacity of 28 TB of Serial ATA disk storage. Using EXP100 expansion units requires FAStT Storage Manager V8.42. The FAStT600 Turbo supports up to 64 storage partitions.

The cache has increased from 256 MB per controller on base FAStT600 to 1 GB per controller on Turbo.

• FAStT700 Storage Server:

The FAStT700 can scale from 36 GB to greater than 32 TB of storage, using 16 EXP700 expansion enclosures. Each expansion enclosure supports up to fourteen 2-Gbps Fibre Channel disk drives.

Note: There is no initial support for the intermix of FC and Serial ATA drawers. Use of EXP100 units requires FAStT Storage Manager V8.41.


This storage server supports high-end configurations with up to 64 heterogeneous host systems. The IBM TotalStorage FAStT700 Storage Server is designed to support high availability, providing protection against component failures. Dual hot-swap RAID controllers help provide high throughput and redundancy; each controller supports 1 GB of battery-backed cache. Redundant fans, power supplies, and dynamic storage management further support high availability and help reduce the risk of costly down time or the loss of valuable data.

• FAStT900 Storage Server:

The FAStT900 Storage Server delivers breakthrough disk performance and outstanding reliability for demanding applications in compute-intensive environments. The FAStT900 is designed to offer investment protection with advanced functions and flexible features. Designed for today’s on demand business needs, the FAStT900 easily scales from 36 GB to over 32 TB to support growing storage requirements. The FAStT900 is an effective storage server for any enterprise seeking performance without borders.

The FAStT900 uses 2 Gbps Fibre Channel connectivity to support high performance (up to 795 MB/s throughput from disk) for faster, more responsive access to data. It provides flexibility for multiplatform storage environments by supporting a wide variety of servers, operating systems, and cluster technologies (certified for Microsoft® Cluster Services, Novell clustering, HACMP, and Veritas Cluster for Solaris). This storage server is well suited for high-performance applications such as online transaction processing (OLTP), data mining, and digital media.
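The raw-capacity figures quoted for these models follow directly from drive count multiplied by drive size. The following minimal Python sketch is purely illustrative (it is not part of any FAStT tool); the drive counts and sizes are the ones given in the model descriptions above, and decimal terabytes are assumed:

# Raw capacity = number of drives x capacity per drive (decimal units assumed).
def raw_capacity_tb(drives: int, drive_gb: int) -> float:
    return drives * drive_gb / 1000.0

# FAStT100: 14 internal SATA drives plus up to three EXP100s of 14 drives each.
print(raw_capacity_tb(14 * (1 + 3), 250))   # 14.0 TB
# FAStT600 Turbo: up to 112 FC drives of 146 GB, or 112 SATA drives of 250 GB.
print(raw_capacity_tb(112, 146))            # 16.352, rounded to 16.4 TB
print(raw_capacity_tb(112, 250))            # 28.0 TB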

Figure 1-2 shows how each IBM TotalStorage FAStT model fits into its category of Open System Computing, from entry level to enterprise level, and Table 1-1 shows their relative positioning.

Figure 1-2 Positioning the FAStT Storage family


Table 1-1 FAStT models and characteristics at-a-glance

FAStT model:          100 | 200 HA | 600 SCU | 600 | 600 Turbo | 700 | 900
Environment:          Workgroup | Workgroup | Workgroup | Workgroup | Departmental | Departmental | Enterprise
Max disks:            56 (SATA) | 60 | 14 | 56 | 112 | 224 | 224
Max raw capacity:     14 TB | 8.8 TB | 2.0 TB | 8.2 TB | 16.4 TB | 32.7 TB | 32.7 TB
Host interfaces:      2 Gbps | 1 Gbps | 2 Gbps | 2 Gbps | 2 Gbps | 2 Gbps | 2 Gbps
SAN attach (max):     4 FC-SW | 2 FC-SW | 2 FC-SW | 4 FC-SW | 4 FC-SW | 4 FC-SW | 4 FC-SW
Direct attach (max):  1 FC-AL | 2 FC-AL | 2 FC-AL | 1 FC-AL | 1 FC-AL | 8 FC-AL* | 8 FC-AL*
Max cache memory:     256 MB/ctrl | 128 MB/ctrl | 256 MB/ctrl | 256 MB/ctrl | 1 GB/ctrl | 1 GB/ctrl | 1 GB/ctrl
IOPS from cache read: - | 11800 | 45500 | 45500 | 77500 | 110000 | 148000
IOPS from disk read:  - | 4600 | 17500 | 17500 | 25500 | 38000 | 53200
Throughput from disk: - | 170 MB/s | 392 MB/s | 392 MB/s | 400 MB/s | 390 MB/s | 790 MB/s
Copy features:        F | F | n/a | F | F,V | F,V,R | F,V,R
Base/max partitions:  0/16 | 16/16 | 0/16 | 0/16 | 8/64 | 64/64 | 16/64

* For more than 4 direct connections, additional mini-hubs must be purchased.
Copy features: F = FlashCopy; V = VolumeCopy; R = Remote Volume Mirroring.

Attention: There is no support, at the time of writing, for the intermix of FC and Serial ATA drawers.

Note: Always consult the IBM TotalStorage™ FAStT Interoperability Matrix for information on the latest supported Storage Manager version for your FAStT system:

http://www.storage.ibm.com/disk/fastt/supserver.htm

1.2 FAStT Storage Manager

FAStT Storage Manager is the software used to manage FAStT Storage Servers. The current version at the time of writing is Version 8.42. (This release supports all FAStT Storage Servers with generally released firmware versions from 04.00.02.xx up to 05.42.xx.xx. Note that specific FAStT models, like the FAStT100 with SATA disks, require specific firmware levels.) With this program, you can configure arrays and logical drives, assign your logical drives to storage partitions, replace and rebuild failed disk drives, expand the size of arrays, and convert from one RAID level to another. You can also perform troubleshooting and management tasks, such as checking the status of FAStT Storage Server components and updating the firmware of RAID controllers.

1.2.1 FAStT Storage Manager components

FAStT Storage Manager includes the following components:

• FAStT Storage Manager Client:

This is the graphical interface used to configure, manage, and troubleshoot the FAStT Storage Server. It can be installed on either the host system or a managing workstation.



The Storage Manager Client contains these components:

– Enterprise Management: Use this component to add, remove, and monitor storage subsystems within the management domain.

– Subsystem Management: Use this component to manage the elements of an individual storage subsystem.

The Event Monitor is a separate program that is bundled with the Storage Manager Client. It runs in the background and can send alert notifications in the event of a critical problem.

• FAStT Storage Manager Agent:

The Storage Manager Agent provides a management conduit for the Storage Manager Client to configure and monitor the subsystem through the Fibre Channel I/O path. The agent also provides local or remote access to the Storage Manager Client depending on whether the client is installed on the host, or in a network management station over the TCP/IP network.

The Fibre Channel link to the FAStT system only uses SCSI commands. The client GUI front end only uses TCP/IP commands. The agent is the piece of software that sits between these two components and translates from TCP/IP to SCSI and back again, so we can use the client to control a directly attached FAStT system.

Also, the FAStT storage can be managed in-band (through Fibre Channel) or out-of-band (through direct network, Ethernet). Both management methods can be used simultaneously. If both connections are used, out-of-band management is the default connection with in-band as the alternate (backup) method.

• Redundant Disk Array Controller (RDAC):

The RDAC component contains a multipath driver and hot-add support. It must be installed on the host system, and it will provide redundant paths to the Storage Server when both RAID controllers are installed. If a RAID controller fails or becomes inaccessible due to connectivity problems, RDAC reroutes the I/O requests through the other RAID controller. The hot-add part of RDAC enables you to register new logical drives to the operating system dynamically. (A minimal sketch of this rerouting behavior follows this list.)

Some operating systems do not use RDAC; they have their own multipath drivers.

• FAStT Utilities:

The FAStT Utilities package contains two command line tools: hot_add and SMdevices. With the hot_add utility, the operating system can detect new logical drives without rebooting the host system. When you run the utility, it will re-scan the host bus adapters and handle the operating system assignments of all new devices found.

The SMdevices utility lists all the logical drives, World Wide Names (WWNs), and the storage subsystem that it can access. This utility is mainly used for troubleshooting, because it provides a basic check of the Storage Server setup and Fibre Channel (FC) connectivity.
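The RDAC behavior described above (send I/O down the path to the preferred controller, and reroute to the alternate controller when that path fails) can be pictured with a short Python sketch. This is only an illustration of the failover idea, not actual RDAC code, and the controller labels are hypothetical:

# Illustrative model of multipath failover (not actual RDAC code).
class DualControllerPath:
    def __init__(self):
        # Hypothetical controller labels; FAStT units have two RAID controllers.
        self.paths = {"A": True, "B": True}   # True = path healthy
        self.preferred = "A"

    def route_io(self) -> str:
        """Return the controller that should receive the next I/O request."""
        if self.paths[self.preferred]:
            return self.preferred
        # Preferred path failed: reroute through the alternate controller.
        alternate = "B" if self.preferred == "A" else "A"
        if self.paths[alternate]:
            return alternate
        raise IOError("No path to the storage server is available")

    def fail_path(self, controller: str):
        self.paths[controller] = False

mp = DualControllerPath()
print(mp.route_io())      # A (preferred controller)
mp.fail_path("A")
print(mp.route_io())      # B (I/O rerouted to the alternate controller)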

Important: For operating systems that require RDAC, RDAC must be loaded, even if you have only one host bus adapter (HBA) in the host.

1.2.2 New features introduced in Storage Manager Version 8.4

The current version at the time of writing is Version 8.42. (This release supports all FAStT Storage Servers with generally released firmware versions from 04.00.02.xx up to 05.42.xx.xx. Note that specific FAStT models, like the FAStT100 with SATA disks, require specific firmware levels.)

The major new features introduced into Storage Manager 8.4 include these:

• Persistent reservations:

Persistent reservations is a SCSI-3 feature for restricting access to storage media, based on the concept of reservations that the host can establish and manipulate. Earlier versions of SCSI provide a simple reservation capability through the RESERVE and RELEASE commands. SCSI-3 persistent reservations provide a significant super-set of the earlier capability. Improvements that come with persistent reservations include:

– Well-defined model for reserving across multiple host and target ports
– Levels of access control, for example, shared reads, exclusive writes, exclusive reads and writes
– Ability to query the storage system about registered ports and reservations
– Provisions for persistence of reservations through power loss at the storage system

Persistent reservations, which are configured and managed through the cluster server software, preserve logical drive reservations and registrations and prevent other hosts from accessing the logical drive.

Persistent reservations are allowed on a primary logical drive in a Remote Mirror, but are not allowed on a secondary logical drive. If a logical drive has any type of reservation when designated as a secondary logical drive, the primary logical drive detects a reservation conflict at its first write request to the secondary logical drive and clears the reservation automatically. Subsequent requests to place a reservation on the secondary logical drive are rejected. (A minimal sketch of the reservation model appears after this feature list.)

• VolumeCopy:

The VolumeCopy feature is a firmware-based mechanism for replicating array data within a controller module (FAStT). This feature is designed as a system management tool for tasks, such as relocating data to other drives for hardware upgrades or performance management, data backup, and restoring FlashCopy logical drive data. This premium feature includes a Create Copy Wizard to assist in creating a logical drive (volume) copy, and a Copy Manager, to monitor logical drive copies after they have been created.

The VolumeCopy premium feature must be enabled by purchasing a feature key file from IBM.

Some applications for the VolumeCopy include these:

– Copying data for greater access:

As your storage requirements for a logical drive change, the VolumeCopy feature can be used to copy data to a logical drive in an array that uses larger capacity disk drives within the same storage subsystem. This provides an opportunity to move data to larger drives (for example, 73 GB to 146 GB), change to drives with a higher data transfer rate (for example, 1 Gbps to 2 Gbps), or to change to drives using new technologies for higher performance.

– Backing up data:

With the VolumeCopy feature, you can create a backup of a logical drive by copying data from one logical drive to another logical drive in the same storage subsystem. The target logical drive can be used as a backup for the source logical drive, for system testing, or to back up to another device, such as a tape drive.

Chapter 1. Introduction to FAStT and SAN 9

Page 34: SG24-6363 - IBM TotalStorage FAStT Best Practices September 2004

– Restoring FlashCopy logical drive data to the base logical drive:

If you need to restore data to the base logical drive from its associated FlashCopy logical drive, the VolumeCopy premium feature can be used to copy the data from the FlashCopy logical drive to the base logical drive. You can create a logical drive copy of the data on the FlashCopy logical drive, and then copy the data to the base logical drive.

• Increase to 256 LUNs (logical drives) per storage partition (from 32):

The 256 LUN support allows the storage array to present up to 256 host-addressable LUNs (numbered 0-255) to a given host port, providing greater connectivity and storage capacity for SAN environments. This capability is a fundamental attribute of the IBM TotalStorage FAStT product and will be present on any array executing a firmware revision level that supports this feature.

This feature includes:

– Increased command queue depth. The term “queue depth” refers to how many outstanding I/O requests will be sent to a given drive. The maximum queue depth is now 16 on the FAStT600 Turbo and FAStT900 (the maximum remains at 8 for the FAStT700).

– Increased number of Fibre Channel logins (now 512 host ports).
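To make the persistent reservations concepts above more concrete, the following Python sketch models registrations and an exclusive-write reservation on a logical drive. It illustrates the SCSI-3 idea only; it is not the firmware implementation, and the host port names are made up:

# Illustrative model of SCSI-3 persistent reservations (not firmware code).
class LogicalDrive:
    def __init__(self):
        self.registrations = set()   # host ports that have registered a key
        self.reservation = None      # (holder, access_type) or None

    def register(self, host_port: str):
        self.registrations.add(host_port)

    def reserve(self, host_port: str, access_type: str):
        if host_port not in self.registrations:
            raise PermissionError("Port must register before reserving")
        self.reservation = (host_port, access_type)

    def write(self, host_port: str):
        holder, access = self.reservation or (None, None)
        if access == "exclusive-write" and host_port != holder:
            raise PermissionError("Reservation conflict")
        return "write accepted"

lun = LogicalDrive()
lun.register("host1-port0")          # hypothetical port name
lun.reserve("host1-port0", "exclusive-write")
print(lun.write("host1-port0"))      # write accepted
# lun.write("host2-port0") would raise a reservation conflict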

Here are some other helpful features:

• Array size increase:

Previous versions of Storage Manager supported up to 30 disk drives per array, but with a 2 TB boundary for the overall array capacity. In other words, with 146 GB disks an array could contain at most 14 disks (or 28 disks of 73 GB each). In Version 8.4, the limit of 30 disk drives remains, but the 2 TB boundary does not (the arithmetic is worked through in the example after this list).

The benefit of this change is better IOPS performance, because you can have more spindles per logical drive (note that the maximum logical drive size remains at 2 TB).

• Supporting Veritas DMP for Solaris:

This provides the ability to have a multipathing solution on Solaris using Sun Fibre Channel host adapters, extend DMP support to OEM storage arrays, and work with DMP without using Sun T3 emulation.

• Increased number of Fibre Channel logins:

The number of Fibre Channel logins per controller port is now equal to the maximum of host ports that can be defined (a maximum of 512 for FAStT700 and FAStT900).

This increase enables more host-to-storage I/O paths, which is critical for storage consolidation environments, and helps ensure that access is always available to all defined hosts.

• Increased large I/O size:

If the transfer length specified for a host read or write operation exceeds a predetermined size, the controller might break the I/O operation down into smaller, more manageable steps. This predetermined size is referred to as the large I/O size. In this release, the large I/O size is 2 MB for all logical drives. The large I/O size is controller platform independent.

Note: Most hosts will be able to have 256 LUNs mapped per storage partition. Microsoft Windows NT®, Sun Solaris with RDAC, NetWare 5.1, and HP-UX 11.0 are restricted to 32 LUNs. If you try to map a logical drive to a LUN that is greater than 32 on these operating systems, the host will be unable to access it. Solaris requires the use of Veritas DMP for failover for 256 LUNs. Consult the IBM TotalStorage FAStT Storage Manager User Guide v8.4 for details and up-to-date information.

10 IBM TotalStorage FAStT Best Practices

Page 35: SG24-6363 - IBM TotalStorage FAStT Best Practices September 2004

• Increased queue depth:

Each controller in the storage subsystem manages input/output (I/O) operations or requests for each drive. The term queue depth refers to how many outstanding I/O requests can be sent to a given drive. A higher queue depth increases the performance of applications with small-block, random I/O activity, because it can boost the IOPS of the drive.

The queue depth is now up to a maximum of 16 for the FAStT600 Turbo and FAStT900 (it remains at 8 for the FAStT700).

• Auto Logical Drive Transfer (ADT) alert notification:

Previous code did not provide alert notification on ADT-induced logical drive ownership changes. This enhancement is intended to remedy that situation. The logical drive transfer alert notification is issued for any instance of a logical drive owned by a non-preferred controller, whether ADT is enabled or not, and is in addition to any informational or critical event already logged within the ADT or RDAC context. Note that whenever a logical drive-not-on-preferred-path condition occurs, a needs-attention condition will be raised immediately; only the alert notification is delayed.

• Failover alert delay:

The failover alert delay lets you delay the logging of a critical event if the multipath driver transfers logical drives to the non-preferred controller. If the transfer occurs within the specified delay period, no critical event is logged. If the transfer exceeds this delay period, a logical drive-not-on-preferred-path alert is issued as a critical event (refer to 2.4.4, “ADT alert notification” on page 53 for more details).

• User control of network parameters:

Users are able to modify certain network parameter settings for each storage controller in the array. The modifiable parameters are controller IP address, gateway IP address, network submask address, and BOOTP enabled or disabled. Changes made to these parameters go into effect immediately, without any need for a controller reboot or reset to make them take effect. They are persistent; they are saved in both NVSRAM and DACstore and will remain in effect across controller reboots and resets until subsequently modified by the user. They are automatically propagated to a replacement controller.

• Recovery from intermittent drive path errors:

This feature improves data availability by having the storage array controller preemptively switch drive I/O operations from the preferred drive channel to the alternate drive channel when intermittent errors occur that prevent an I/O operation to a drive from being successfully completed.

• Host software improvements:

The improvements include major event log GUI enhancements, a dialog box for deleting multiple volumes, and Windows installation enhancements (version numbers for Add/Remove and automatic deletion and installation).

• Selected enhancements:

– Recovery Profile: An append-only file that can be used by IBM technical support for troubleshooting FAStT issues.

– Additional warnings/states:

• Added new battery status (charging or not present).
• Warning: out of sync clocks (mechanisms to synchronize, and to display the date and time, are also provided).
• Critical events for loss of drive path redundancy.

– Automatic save of the Read Link Status Diagnostic (RLS) data.
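The 2 TB array boundary discussed under "Array size increase" above is easy to verify with simple arithmetic. A minimal Python check (capacities in decimal gigabytes, as quoted in the text):

# Pre-8.4 rule: the overall array capacity could not exceed roughly 2 TB.
print(14 * 146)   # 2044 GB -- about the most that fits under 2 TB with 146 GB drives
print(28 * 73)    # 2044 GB -- the same boundary reached with twice as many 73 GB drives
print(30 * 146)   # 4380 GB -- possible in Version 8.4, where only the 30-drive limit remains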


1.3 Introduction to SAN

With the evolution of information technology (IT) and the Internet, there has been a large demand for data management, as well as a rapid increase in data capacity requirements.

For businesses, data access is critical and requires performance, availability, and flexibility. In other words, there is a need for a data access network that is fast, redundant (multipath), easy to manage, and always available. That network is a Storage Area Network (SAN).

A SAN is a high-speed network that enables the establishment of direct connections between storage devices and hosts (servers) within the distance supported by Fibre Channel.

The SAN can be viewed as an extension of the storage bus concept, which enables storage devices to be interconnected using concepts similar to that of local area networks (LANs) and wide area networks (WANs). A SAN can be shared between servers or dedicated to one server, or both. It can be local or extended over geographical distances.

The diagram in Figure 1-3 shows a brief overview of a SAN connecting multiple servers to multiple storage systems.

Figure 1-3 What is a SAN?

SANs create new methods of attaching storage to servers. These new methods can enable great improvements in availability, flexibility, and performance. Today’s SANs are used to connect shared storage arrays and tape libraries to multiple servers, and are used by clustered servers for failover. A big advantage of SANs is the sharing of devices among heterogeneous hosts.



1.3.1 SAN components

In this section, we present a brief overview of the basic SAN storage concepts and building blocks.

Figure 1-4 SAN components

SAN servers

The server infrastructure is the underlying reason for all SAN solutions. This infrastructure includes a mix of server platforms, such as Microsoft Windows, UNIX® (and its various flavors), and IBM z/OS®.

SAN storage

The storage infrastructure is the foundation on which information relies, and therefore, must support a company’s business objectives and business model. In this environment, simply deploying more and faster storage devices is not enough. A SAN infrastructure provides enhanced network availability, data accessibility, and system manageability. It is important to remember that a good SAN begins with a good design. The SAN liberates the storage device, so it is not on a particular server bus, and attaches it directly to the network. In other words, storage is externalized and can be functionally distributed across the organization. The SAN also enables the centralization of storage devices and the clustering of servers, which has the potential to make for easier and less expensive centralized administration that lowers the total cost of ownership (TCO).

Fibre Channel

Today, Fibre Channel (FC) is the architecture on which most SAN implementations are built. Fibre Channel is a technology standard that enables data to be transferred from one network node to another at very high speeds. Current implementations transfer data at 1 Gbps or 2 Gbps (10 Gbps data rates have already been tested).



Fibre Channel was developed through industry cooperation, unlike SCSI, which was developed by a vendor, and submitted for standardization after the fact.

Some people refer to Fibre Channel architecture as the Fibre version of SCSI. Fibre Channel is an architecture used to carry IPI traffic, IP traffic, FICON® traffic, FCP (SCSI) traffic, and possibly traffic using other protocols, all on the standard FC transport. An analogy could be Ethernet, where IP, NETBIOS, and SNA are all used simultaneously over a single Ethernet adapter, because these are all protocols with mappings to Ethernet. Similarly, there are many protocols mapped onto FC.

SAN topologies

Fibre Channel interconnects nodes using three physical topologies that can have variants. These three topologies are:

• Point-to-point: The point-to-point topology consists of a single connection between two nodes. All the bandwidth is dedicated to these two nodes.

• Loop: In the loop topology, the bandwidth is shared between all the nodes connected to the loop. The loop can be wired node-to-node; however, if a node fails or is not powered on, the loop is out of operation. This is overcome by using a hub. A hub opens the loop when a new node is connected, and closes it when a node disconnects.

• Switched or fabric: A switch enables multiple concurrent connections between nodes. There are two types of switches: circuit switches and frame switches. Circuit switches establish a dedicated connection between two nodes, whereas frame switches route frames between nodes and establish the connection only when needed. This is also known as switched fabric.

Note: The fabric (or switched) topology gives the most flexibility and ability to grow your installation for future needs.

SAN interconnects

Fibre Channel employs a fabric to connect devices. A fabric can be as simple as a single cable connecting two devices. However, the term is most often used to describe a more complex network using cables and interface connectors, host bus adapters (HBAs), extenders, and switches.

Fibre Channel switches function in a manner similar to traditional network switches to provide increased bandwidth, scalable performance, an increased number of devices, and in some cases, increased redundancy. Fibre Channel switches vary from simple edge switches to enterprise-scalable core switches or Fibre Channel directors.

Inter-Switch Links (ISLs)

Switches can be linked together using either standard connections or Inter-Switch Links. Under normal circumstances, traffic moves around a SAN using the Fabric Shortest Path First (FSPF) protocol. This allows data to move around a SAN from initiator to target using the quickest of alternate routes. However, it is possible to implement a direct, high-speed path between switches in the form of ISLs.

Trunking

Inter-Switch Links can be combined into logical groups to form trunks. In IBM TotalStorage switches, trunks can be groups of up to four ports on a switch connected to four ports on a second switch. At the outset, a trunk master is defined, and subsequent trunk slaves can be added. This has the effect of aggregating the throughput across all links. Therefore, in the case of switches with 2 Gbps ports, we can trunk up to four ports, allowing for an 8 Gbps Inter-Switch Link.
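The trunking arithmetic quoted above is simply the number of trunked ISLs multiplied by the per-link speed. A small Python check:

def trunk_bandwidth_gbps(links: int, link_speed_gbps: int) -> int:
    return links * link_speed_gbps

print(trunk_bandwidth_gbps(4, 2))   # 8 -- four 2 Gbps ISLs trunked into one 8 Gbps logical link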



Chapter 2. FAStT planning tasks

Careful planning is essential to any new storage installation.

Choosing the right equipment and software, and also knowing what the right settings are for your installation, can be challenging. Every installation has to answer these questions and accommodate specific requirements, and there can be numerous variations in the solution.

A well-thought-out design and planning effort prior to the implementation will help you get the most out of your investment for the present and protect it for the future.

During the planning process, you need to answer numerous questions about your environment:

• What are my SAN requirements?
• What hardware do I need to buy?
• What reliability do I require?
• What redundancy do I need? (for example, do I need off-site mirroring?)
• What compatibility issues do I need to address?
• What operating system am I going to use (existing or new installation)?
• What applications will access the storage subsystem?
• What are these applications’ hardware and software requirements?
• What will be the physical layout of the installation? Only local site, or remote sites as well?
• What performance do I need?
• How much does it cost?

This list of questions is not exhaustive, and as you can see, some go beyond simply configuring the FAStT Storage Server.

This chapter provides guidelines to help you in the planning process.

Some recommendations in this chapter come directly from experience with various FAStT installations at customer sites.


2.1 Planning your SAN

When setting up a Storage Area Network, you want it to not only answer your current requirements, but also be able to fulfill future needs. First, your SAN should be able to accommodate a growing demand in storage (it is estimated that storage needs double every two years). Second, your SAN must be able to keep up with the constant evolution of technology and the resulting hardware upgrades and improvements. It is estimated that you will have to upgrade your storage installation every two to three years.
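The rule of thumb above (storage needs roughly double every two years) translates into a simple compound-growth estimate that you can use when sizing for the two-to-three year life of an installation. A minimal Python sketch; the starting capacity is a hypothetical example:

# Projected capacity if demand doubles every two years (rule of thumb from the text).
def projected_capacity_tb(current_tb: float, years: float, doubling_period_years: float = 2.0) -> float:
    return current_tb * 2 ** (years / doubling_period_years)

# Hypothetical example: a 4 TB installation planned over a three-year refresh cycle.
print(round(projected_capacity_tb(4.0, 3), 1))   # about 11.3 TB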

Compatibility among different pieces of equipment is crucial when planning your installation. The important question is what device works with what, and also who has tested and certified (desirable) what equipment.

When designing a SAN storage solution, it is good practice to complete the following steps:

1. Produce a statement that outlines the solution requirements that can be used to determine the type of configuration you need. It should also be used to cross-check that the solution design delivers the basic requirements. The statement should have easily defined bullet points covering the requirements, for example:

– Required capacity
– Required redundancy levels
– Backup and restore windows
– Type of data protection needed
– Network backups
– LAN-free backups
– Serverless backups
– FlashCopy
– Remote Volume Mirroring
– Host and operating system types to be connected to SAN
– Number of host connections required

2. Produce a hardware checklist. It should cover such items that require you to:

– Ensure that the minimum hardware requirements are met.

– Make a complete list of the hardware requirements, including the required premium options.

– Ensure your primary and secondary storage subsystems are properly configured.

– Ensure that your Fibre Channel switches and cables are properly configured. The Remote Mirroring links must be in a separate zone.

3. Produce a software checklist to cover all the required items that need to be certified and checked. It should include such items that require you to:

– Ensure that data on the primary and secondary storage subsystems participating in Remote Volume Mirroring is backed up.

– Ensure that the correct versions of firmware and storage-management software are installed.

– Ensure that the Remote Volume Mirror option is enabled on both the primary and secondary storage subsystems.

– Ensure that the Remote Volume Mirror option is activated and that a mirror repository logical drive is created for each controller in all participating storage subsystems.

– Ensure that the required primary and secondary logical drives are created on the primary and remote storage subsystems.


For more complete information regarding Storage Area Networks, refer to the following IBM Redbooks:

• Introduction to Storage Area Networks, SG24-5470
• IBM SAN Survival Guide, SG24-6143

2.2 SAN zoning

A zone is a specified grouping of fabric-connected devices.

Typically, you use zones to do the following tasks:

• Provide security: Use zones to provide controlled access to fabric segments and to establish barriers between operating environments, for example, isolate systems with different uses or protect systems in a heterogeneous environment.

• Customize environments: Use zones to create logical subsets of the fabric to accommodate closed user groups or to create functional areas within the fabric, for example, include selected devices within a zone for the exclusive use of zone members, or create separate test or maintenance areas within the fabric.

• Optimize IT resources: Use zones to consolidate equipment logically for IT efficiency, or to facilitate time-sensitive functions, for example, create a temporary zone to back up non-member devices.

Figure 2-1 shows three zones with some overlaps.

Without zoning, failing devices that are no longer following the defined rules of fabric behavior might attempt to interact with other devices in the fabric. This type of event would be similar to an Ethernet device causing broadcast storms or collisions on the whole network, instead of being restricted to one single segment or switch port. With zoning, these failing devices cannot affect devices outside of their zone.

Note: Utilizing zoning is always a good idea with SANs that include more than one host. With SANs that include more than one operating system, or SANs that contain both tape and disk devices, it is mandatory.


Figure 2-1 SAN zoning

2.2.1 Zone types

A zone member can be specified using one of the following notations:

• Node World Wide Name
• Port World Wide Name
• Physical Fabric port number

The following list describes the zones types:

Port level zone: A zone containing members specified by switch ports (domain ID, port number) only. Port level zoning is enforced by hardware in the switch.

WWN zone: A zone containing members specified by device World Wide Name (WWN) only. WWN zones are hardware enforced in the switch.

Mixed zone: A zone containing some members specified by WWN and some members specified by switch port. Mixed zones are software enforced through the fabric name server.

Tip: It is often debated whether one should use zoning to enable the sharing of HBAs for disk storage and tape connectivity. We can confidently state that this should not be done. It is possible to make it work, but due to HBA limitations, it cannot provide reliable operation at this time, and failures are impossible to accurately predict and difficult to diagnose. Disk and tape should be on separate HBAs and also should have different zones. With UNIX systems supported by the FAStT, HBA sharing is explicitly forbidden by the FAStT documentation, and it is simply an extremely bad idea in other environments.

For systems such as the IBM BladeCenter servers that have a limited number of FC ports available, it is suggested that you perform a LAN backup rather than a LAN-free backup directly to the tape drives.



Zones can be hardware enforced or software enforced:

• In a hardware-enforced zone, zone members can be specified by physical port number, or in recent switch models, through WWN, but not within the same zone.

• A software-enforced zone is created when a port member and WWN members are in the same zone.

Note: You do not explicitly specify a type of enforcement for a zone. The type of zone enforcement (hardware or software) depends on the type of member it contains (WWNs and/or ports).

2.2.2 Zoning configuration

Zoning is not that hard to understand or configure. Using your switch’s management software, use WWN zoning to set up each zone so that it contains one server port, and whatever storage device ports that host port requires access to. You do not need to create a separate zone for each source/destination pair. Do not put disk and tape in the same zone.

When configuring WWN-based zoning, it is important to always use the Port WWN, not the Node WWN. With many systems, the Node WWN is based on the Port WWN of the first adapter detected by the HBA driver. If the adapter the Node WWN was based on were to fail, and you based your zoning on the Node WWN, your zoning configuration would become invalid. Subsequently the host with the failing adapter would completely lose access to the storage attached to that switch.

Keep in mind that you will need to update the zoning information, should you ever need to replace a Fibre Channel adapter in one of your servers. Most storage systems, such as the FAStT, Enterprise Storage Server®, and IBM Tape Libraries, have a WWN tied to the Vital Product Data of the system unit, so individual parts may usually be replaced with no effect on zoning.
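The zoning rules just described (one server port per zone, zone by Port WWN rather than Node WWN, and keep disk and tape apart) can be captured as simple data before you type anything into the switch management software. The following Python sketch is illustrative only; the WWNs, zone names, and the "10:" host-port convention used in the check are all made up:

# Illustrative zoning layout: each zone holds one host port plus the storage ports it needs.
# All members are Port WWNs (never Node WWNs). Every value below is fictitious.
zones = {
    "host1_fastt": [
        "10:00:00:00:c9:aa:bb:01",   # host1 HBA port (Port WWN)
        "20:02:00:a0:b8:00:00:01",   # FAStT controller A host port
        "20:03:00:a0:b8:00:00:02",   # FAStT controller B host port
    ],
    "host1_tape": [
        "10:00:00:00:c9:aa:bb:02",   # second HBA port -- disk and tape stay on separate HBAs
        "50:05:07:63:00:c0:00:01",   # tape drive port
    ],
}

def check_zoning(zone_table: dict) -> None:
    for name, members in zone_table.items():
        # Fictitious convention: host HBA Port WWNs start with "10:" in this example.
        host_ports = [m for m in members if m.startswith("10:")]
        assert len(host_ports) == 1, f"zone {name} should contain exactly one server port"

check_zoning(zones)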

For more details on configuring zoning with your particular switch, see the IBM Redbook Implementing an Open IBM SAN, SG24-6116.

2.3 Physical components and characteristics

In this section, we review elements related to physical characteristics of an installation, such as fibre cables, Fibre Channel adapters, and other elements related to the structure of the storage system and disks, including arrays, controller ownership, segment size, storage partitioning, caching, hot-spare drives, and Remote Volume Mirroring.

2.3.1 Rack considerations

The FAStT and any expansion units are mounted in rack enclosures.

General planning

Consider the following general planning guidelines:

• Determine:
– The size of the floor area required by the equipment:
  • Floor-load capacity
  • Space needed for expansion
  • Location of columns
– The power and environmental requirements.



Create a floor plan to check for clearance problems. Be sure to include the following considerations on the layout plan:

• Service clearances required for each rack or suite of racks.

• If the equipment is on a raised floor:
– Things that might obstruct cable routing
– The height of the raised floor

• If the equipment is not on a raised floor:
– The placement of cables to minimize obstruction
– If the cable routing is indirectly between racks (such as along walls or suspended), the amount of additional cable

• Location of:
– Power receptacles
– Air conditioning equipment and controls
– File cabinets, desks, and other office equipment
– Room emergency power-off controls
– All entrances, exits, windows, columns, and pillars

• Make a full-scale template (if necessary) of the rack and carry it along the access route to check for potential clearance problems through doorways and passage ways, around corners, and in elevators.

• Store all spare materials that can burn in properly designed and protected areas.

Rack layout

To be sure that you have enough space for the racks, create a floor plan before installing them. You might need to prepare and analyze several layouts before choosing the final plan.

If you are installing the racks in two or more stages, prepare a separate layout for each stage.

Consider the following things when you make a layout:

• The flow of work and personnel within the area.
• Operator access to units, as required.
• If the rack is on a raised floor:
– Position it over a cooling register.
– The bottom of the rack is open to facilitate cooling.
• If the rack is not on a raised floor:
– The maximum cable lengths
– The need for cable guards, ramps, etc. to protect equipment and personnel.
• Location of any planned safety equipment.
• Expansion.

Review the final layout to ensure that cable lengths are not too long and that the racks have enough clearance.

You need at least 152 cm (60 in.) of space between 42-U rack suites. This space is necessary for opening the front and rear doors and for installing and servicing the rack. It also allows air circulation for cooling the equipment in the rack. All vertical rack measurements are given in rack units (U). One U is equal to 4.45 cm (1.75 in.). The U levels are marked on labels on one front mounting rail and one rear mounting rail. Figure 2-2 shows an example of the required service clearances for a 9306-900 42U rack.


Figure 2-2 9306 Enterprise rack space requirements
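Because rack heights are quoted in rack units (1U = 4.45 cm, or 1.75 in.), converting a planned layout into centimetres for the floor plan is straightforward. A minimal Python sketch; the 3U enclosure heights used in the example are only illustrative, so check the actual height of each unit you plan to install:

U_CM = 4.45   # one rack unit in centimetres (1.75 in.)

def used_height_cm(unit_heights_u: list) -> float:
    return sum(unit_heights_u) * U_CM

# Hypothetical layout: one 3U storage server plus three 3U expansion enclosures.
print(used_height_cm([3, 3, 3, 3]))   # 53.4 cm
print(42 * U_CM)                      # 186.9 cm -- total usable height of a 42U rack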

2.3.2 Cables and connectors

In this section, we discuss some essential characteristics of fibre cables and connectors. This should help you understand the options you have for connecting your FAStT, with or without a SAN.

Cable types (shortwave or longwave)

Fiber cables are basically available in multi-mode fiber (MMF) or single-mode fiber (SMF).

Multi-mode fiber allows light to disperse in the fiber so that it takes many different paths, bouncing off the edge of the fiber repeatedly to finally get to the other end (multi-mode means multiple paths for the light). The light taking these different paths gets to the other end of the cable at slightly different times (different path, different distance, different time). The receiver has to determine which signals go together as they all come flowing in.

The maximum distance is limited by how “blurry” the original signal has become. The thinner the glass, the less the signals “spread out,” and the further you can go and still determine what is what on the receiving end. This dispersion (called modal dispersion) is the critical factor in determining the maximum distance a high-speed signal can go. It is more relevant than the attenuation of the signal (from an engineering standpoint, it is easy enough to increase the power level of the transmitter or the sensitivity of your receiver, or both, but too much dispersion cannot be decoded no matter how strong the incoming signals are).


There are two different core sizes of multi-mode cabling available: 50 micron and 62.5 micron. Intermixing the two different core sizes can produce unpredictable and unreliable operation. Therefore, core size mixing is not supported by IBM. Users with an existing optical fibre infrastructure are advised to ensure that it meets Fibre Channel specifications and is a consistent size between pairs of FC transceivers.

Single-mode fiber (SMF) is so thin (9 microns) that the light can barely “squeeze” through and it tunnels through the center of the fiber using only one path (or mode). This behavior can be explained (although not simply) through the laws of optics and physics. The result is that because there is only one path that the light takes to the receiver, there is no “dispersion confusion” at the receiver. However, the concern with single mode fiber is attenuation of the signal.

Table 2-1 lists the supported distances.

Table 2-1 Cable type overview

Fiber type                     Speed    Maximum distance
9 micron SMF (longwave)        1 Gbps   10 km
9 micron SMF (longwave)        2 Gbps   2 km
50 micron MMF (shortwave)      1 Gbps   500 m
50 micron MMF (shortwave)      2 Gbps   300 m
62.5 micron MMF (shortwave)    1 Gbps   175 m/300 m
62.5 micron MMF (shortwave)    2 Gbps   90 m/150 m

Note that the “maximum distance” shown in Table 2-1 is just that, a maximum. Low quality fiber, poor terminations, excessive numbers of patch panels, etc. can cause these maximums to be far shorter. Ensure that your cabling meets Fibre Channel standards, as detailed in the ANSI specifications. The cabling specifications are in Chapter 8 and Annex C and E of the FC-PH and FC-PH-2 standards, available at:

http://www.t11.org

All IBM fiber feature codes orderable with the FAStT meet the standards.
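Table 2-1 can be turned into a quick planning check: given the fiber type and the link speed, confirm that a proposed cable run stays within the supported maximum. A minimal Python sketch using the values from the table (for the 62.5 micron entries, the shorter of the two published figures is used to stay conservative):

# Maximum supported distances from Table 2-1, in metres (conservative values).
MAX_DISTANCE_M = {
    ("9um SMF longwave", 1): 10000,
    ("9um SMF longwave", 2): 2000,
    ("50um MMF shortwave", 1): 500,
    ("50um MMF shortwave", 2): 300,
    ("62.5um MMF shortwave", 1): 175,
    ("62.5um MMF shortwave", 2): 90,
}

def link_supported(fiber: str, speed_gbps: int, run_m: float) -> bool:
    return run_m <= MAX_DISTANCE_M[(fiber, speed_gbps)]

print(link_supported("50um MMF shortwave", 2, 250))     # True  (within the 300 m limit)
print(link_supported("62.5um MMF shortwave", 2, 120))   # False (beyond the conservative 90 m figure)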

Interfaces, connectors, and adapters

In Fibre Channel technology, frames are moved from source to destination using gigabit transport, which is a requirement to achieve fast transfer rates. To communicate with gigabit transport, both sides have to support this type of communication. This is accomplished by using specially designed interfaces that convert other communication transports into gigabit transport.

The interfaces that are used to convert the internal communication transport of gigabit transport are, depending on the FAStT model either Small Form Factor Transceivers (SFF), also often called Small Form Pluggable (SFP) or Gigabit Interface Converters (GBIC). See Figure 2-3.

Fiber type Speed Maximum distance

9 micron SMF (longwave) 1 Gbps 10 km

9 micron SMF (longwave) 2 Gbps 2 km

50 micron MMF (shortwave) 1 Gbps 500 m

50 micron MMF (shortwave) 2 Gbps 300 m

62.5 micron MMF (shortwave) 1 Gbps 175 m/300 m

62.5 micron MMF (shortwave) 2 Gbps 90 m/150 m


Figure 2-3 Small Form Factor Transceiver (left) and GBIC (right)

The FAStT600, FAStT700, EXP700, and FAStT900 require SFP transceivers; the FAStT200, FAStT500, and EXP500 require GBICs.

Obviously, the particular connectors used to connect a fiber to a component will depend upon the receptacle into which they are being plugged.

SC connector

The duplex SC connector is a low-loss, push/pull fitting connector. It is easy to configure and replace. The two fibers each have their own part of the connector. The connector is keyed to ensure correct polarization when connected, that is, transmit to receive and vice versa.

The mini-hubs in FAStT500 and EXP500 use SC connectors (Figure 2-4).

Figure 2-4 Dual SC fiber-optic plug connector

LC connector

The connectors that plug into SFF or SFP devices are called LC connectors. Again, a duplex version is used so that the transmit and receive are connected in one step.

The main advantage of LC connectors over SC connectors is their smaller form factor; therefore, manufacturers of Fibre Channel components are able to provide more connections in the same amount of space.

The FAStT600, FAStT700, FAStT900 and EXP700 use LC connectors (Figure 2-5).


Figure 2-5 SFF hot-pluggable transceiver (SFP) with LC connector fiber cable

Adapter cable

The LC-SC adapter cable attaches to the end of an LC-LC cable to support SC device connections. A combination of one LC-LC fiber cable and one LC-SC adapter cable is required for each connection. This is used to connect older 1 Gbps devices to a 2 Gbps capable, LC interface-based SAN (Figure 2-6).

Figure 2-6 Fibre Channel LC-SC adapter cable

Interoperability of 1 Gbps and 2 Gbps devices

The Fibre Channel standard specifies a procedure for speed auto-detection. Therefore, if a 2 Gbps port on a switch or device is connected to a 1 Gbps port, it negotiates down and runs the link at 1 Gbps. If there are 2 Gbps ports on both ends of a link, the negotiation runs the link at 2 Gbps if the link is up to specifications. A link that is too long or "dirty" could end up running at 1 Gbps even with 2 Gbps ports at either end, so watch your distances and make sure your fiber is good.


Not all ports are capable of auto-negotiation. For example, the ports on the FAStT700 and EXP700 must be manually set to 1 or 2 Gbps. (These products ship preset to 2 Gbps.)

2.3.3 Cable management and labeling

Cable management and labeling for solutions using racks, n-node clustering, and Fibre Channel are increasingly important in open systems solutions. Cable management and labeling needs have expanded from the traditional labeling of network connections to management and labeling of most cable connections between your servers, disk subsystems, multiple network connections, and power and video subsystems. Examples of solutions include Fibre Channel configurations, n-node cluster solutions, multiple unique solutions located in the same rack or across multiple racks, and solutions where components might not be physically located in the same room, building, or site.

Why is more detailed cable management required?

The necessity for detailed cable management and labeling is due to the complexity of today's configurations, potential distances between solution components, and the increased number of cable connections required to attach additional value-add computer components. Benefits from more detailed cable management and labeling include ease of installation, ongoing solutions/systems management, and increased serviceability.

Solutions installation and ongoing management are easier to achieve when your solution is correctly and consistently labeled. Labeling helps make it possible to know what system you are installing or managing, for example, when it is necessary to access the CD-ROM of a particular system, and you are working from a centralized management console. It is also helpful to be able to visualize where each server is when completing custom configuration tasks such as node naming and assigning IP addresses.

Cable management and labeling improve service and support by reducing problem determination time and ensuring that the correct cable is disconnected when necessary. Labels assist in quickly identifying which cable needs to be removed when it is connected to a device such as a hub that might have multiple connections of the same cable type. Labels also help identify which cable to remove from a component. This is especially important when a cable connects two components that are not in the same rack, room, or even the same site.

Cable planning

Successful cable management planning includes three basic activities: site planning (before your solution is installed), cable routing, and cable labeling.

Site planning

Adequate site planning completed before your solution is installed reduces the chance of installation problems. Significant attributes covered by site planning are location specifications, electrical considerations, raised/non-raised floor determinations, and determination of cable lengths. Consult the documentation of your solution for special site planning considerations. IBM Netfinity® Racks document site planning information in the IBM Netfinity Rack Planning and Installation Guide, part number 24L8055.


Cable routing

With effective cable routing, you can keep your solution's cables organized, reduce the risk of damaging cables, and allow for effective service and support. To assist with cable routing, IBM recommends the following guidelines:

• When installing cables to devices mounted on sliding rails:

– Run the cables neatly along equipment cable-management arms and tie the cables to the arms. (Obtain the cable ties locally.)

– Take particular care when attaching fiber optic cables to the rack. Refer to the instructions included with your fiber optic cables for guidance on minimum radius, handling, and care of fiber optic cables.

– Run the cables neatly along the rack rear corner posts.

– Use cable ties to secure the cables to the corner posts.

– Make sure the cables cannot be pinched or cut by the rack rear door.

– Run internal cables that connect devices in adjoining racks through the open rack sides.

– Run external cables through the open rack bottom.

– Leave enough slack so that the device can be fully extended without putting a strain on the cables.

– Tie the cables so that the device can be retracted without pinching or cutting the cables.

• To avoid damage to your fiber-optic cables, follow these guidelines:

– Use great care when utilizing cable management arms.

– When attaching to a device on slides, leave enough slack in the cable so that it does not bend to a radius smaller than 76 mm (3 in.) when extended or become pinched when retracted.

– Route the cable away from places where it can be snagged by other devices in the rack.

– Do not overtighten the cable straps or bend the cables to a radius smaller than 76 mm (3 in.).

– Do not put excess weight on the cable at the connection point and be sure that it is well supported. For instance, a cable that goes from the top of the rack to the bottom must have some method of support other than the strain relief boots built into the cable.

Additional information for routing cables with IBM Netfinity Rack products can be found in the IBM Netfinity Rack Planning and Installation Guide, part number 24L8055. This publication includes pictures providing more details about the recommended cable routing.

Cable labeling

When labeling your solution, follow these tips:

• As you install cables in the rack, label each cable with appropriate identification.

• Remember to attach labels to any cables you replace.

• Document deviations from the label scheme you use. Keep a copy with your Change Control Log book.

Note: Do not use cable-management arms for fiber cables.


Whether you use a simple or complex scheme, the label should always implement a format including these attributes (a small example follows this list):

• Function (optional)

• Location information, from broad to specific (for example, from the site/building down to a specific port on a server or hub)

• The optional Function row helps identify the purpose of the cable (that is, Ethernet versus token-ring, or between multiple networks).
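As an illustration of such a scheme, here is a minimal Python sketch that builds a label string from a function code and a broad-to-specific location. The field names, order, and separator are illustrative assumptions, not an IBM labeling standard.

    def make_label(location_fields, function=None):
        # Location information runs from broad to specific, for example site, rack, device, port.
        location = "/".join(location_fields)
        # The optional function field identifies the purpose of the cable (for example, FC or ETH).
        return f"{function}: {location}" if function else location

    # Hypothetical example: a Fibre Channel cable to HBA1 of server SRV01 in rack R02 at site B1.
    print(make_label(["B1", "R02", "SRV01", "HBA1"], function="FC"))   # prints: FC: B1/R02/SRV01/HBA1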

Other cabling mistakes

Some of the most common mistakes include these:

• Leaving cables hanging from connections with no support.
• Not using dust caps.
• Not keeping connectors clean. (Some cable manufacturers require the use of lint-free alcohol wipes in order to maintain the cable warranty.)
• Leaving cables on the floor where people might kick or trip over them.
• Not removing old cables when they are no longer needed or planned for future use.

2.3.4 Fibre Channel adapters

We now review two topics related to Fibre Channel adapters:

• Placement on the host system bus
• Distributing the load among several adapters

Host system bus

Today, there is a choice of high-speed adapters for connecting disk drives. Fast adapters can provide better performance, but you must be careful not to put all the high-speed adapters on a single system bus. Otherwise, the computer bus becomes the performance bottleneck.

It is recommended to distribute high-speed adapters across several buses. When you use PCI adapters, make sure you first review your system specifications. Some systems include a PCI adapter placement guide.

The number of adapters you can install depends on the number of PCI slots available on your server, but also on what traffic volume you expect on your SAN.

Do you want only failover capabilities on the storage side (one HBA, two paths), or do you want to share the workload and have fully redundant path failover with multiple adapters and over multiple paths? In general, all operating systems support two paths to the FAStT Storage Server. Microsoft Windows 2000 and Windows 2003 support up to four paths to the storage controller.

As illustrated in Figure 2-7, AIX can also support four paths to the controller, provided there are two partitions accessed within the FAStT subsystem. You can configure up to two HBAs per partition and up to two partitions per FAStT storage server (you cannot have four paths to the same partition at the time of writing).

Important: Do not place all the high-speed host bus adapters (HBAs) on a single system bus.


Figure 2-7 AIX: Four paths to FAStT

Load sharing

When we talk about load sharing, we mean that the I/O requests are equally distributed among the adapters. This can be achieved by assigning the LUNs to FAStT controllers A and B alternately (see Section 2.3.6, "Logical drives and controller ownership" on page 35).

Figure 2-8 shows the principle for a load sharing setup (Windows environment). Windows is the only operating system where a kind of “forced” load sharing happens. IBM Redundant Disk Array Controller (RDAC) checks all available paths to the controller. In Figure 2-8, that would be four paths (blue zone). RDAC now forces the data down all paths in a “round robin” scheme, as explained in “Load balancing with RDAC (round robin)” on page 52. That means it does not really check for the workload on a single path but moves the data down in a “rotational manner” (round-robin).

Figure 2-8 Load sharing approach for multiple HBAs
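The following minimal Python sketch illustrates the round-robin idea: I/O requests are dispatched down the available paths in strict rotation, without regard to how busy each path currently is. It is a conceptual model only, with made-up path names, and is not RDAC code.

    from itertools import cycle

    # Hypothetical paths: two HBAs, each zoned to both FAStT controllers (the four paths of the blue zone).
    paths = ["hba0-ctrlA", "hba0-ctrlB", "hba1-ctrlA", "hba1-ctrlB"]
    next_path = cycle(paths)   # rotate through the paths in a fixed order

    def dispatch(io_request):
        # Round robin: take the next path in rotation, regardless of its current load.
        path = next(next_path)
        print(f"{io_request} -> {path}")

    for i in range(6):
        dispatch(f"I/O {i}")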


In a single-server environment, AIX is the other OS that allows load sharing (also called load balancing). You can set the load balancing parameter to yes. If there is a heavy workload on one path, the driver moves other LUNs to the controller with less workload and, when the workload decreases, back to the preferred controller. A problem that can occur is disk thrashing: the driver moves a LUN back and forth from one controller to the other. As a result, the controller is more occupied with moving disks around than with servicing I/O. The recommendation is not to load balance on an AIX system; the performance increase is minimal (or performance could actually get worse).

2.3.5 Planning your storage structure and performance

An important question and primary concern for most users is how to configure the storage subsystem for the best performance. There is no simple answer, no best guideline for storage performance optimization that is valid in every environment and for every particular situation.

Furthermore, it is crucial to keep in mind that storage is just a piece of the overall solution. Other components of the solution, as represented in Figure 2-9, can be responsible for bad performance. Tuning the whole system is an iterative process. Every time you make a change to one of the components, you must re-evaluate all of the other components.

Figure 2-9 Iterative steps for system performance tuning

Here, we only cover the disk I/O subsystem.

In particular, we review and discuss the RAID levels, array size, array configuration, and channel protection.

An array is a set of drives that the system logically groups together to provide one or more logical drives to an application host or cluster.

Note: This setup is not supported in clustering. In a cluster environment, you need a single path to each of the controllers (A and B) of the FAStT.

Tip: Do not use load balancing in AIX.


When defining arrays, you often have to compromise among capacity, performance, and redundancy.

We go through the different RAID levels and explain why we would choose a particular setting in a particular situation, and then you can draw your own conclusions.

RAID levels

Performance varies based on the RAID level that is set. Use the Performance Monitor to obtain logical drive application read and write percentages.

RAID-0: For performance, but generally not recommended

RAID-0 (Figure 2-10) is also known as data striping. It is well-suited for program libraries requiring rapid loading of large tables, or more generally, applications requiring fast access to read-only data or fast writing. RAID-0 is only designed to increase performance. There is no redundancy, so any disk failure requires reloading from backups. Select RAID level 0 for applications that would benefit from the increased performance capabilities of this RAID level. Never use this level for critical applications that require high availability.

Figure 2-10 RAID 0

RAID-1: For availability/good read response time

RAID-1 (Figure 2-11) is also known as disk mirroring. It is most suited to applications that require high data availability, good read response times, and where cost is a secondary issue. The response time for writes can be somewhat slower than for a single disk, depending on the write policy. The writes can either be executed in parallel for speed or serially for safety. Select RAID level 1 for applications with a high percentage of read operations and where the cost is not the major concern.

Tip: Unless you have unique requirements, it is highly recommended to let the system create arrays automatically. This usually ensures the best balance between capacity, performance, and redundancy. A manual configuration typically does not result in the best settings.


Because the data is mirrored, the capacity of the logical drive when assigned RAID level 1 is 50% of the array capacity.

Figure 2-11 RAID 1

Some recommendations when using RAID-1 include:

• Use RAID-1 for the disks that contain your operating system. It is a good choice, because the operating system can usually fit on one disk.

• Use RAID-1 for transaction logs. Typically, the database server transaction log can fit on one disk drive. In addition, the transaction log performs mostly sequential writes. Only rollback operations cause reads from the transaction logs. Therefore, we can achieve a high rate of performance by isolating the transaction log on its own RAID-1 array.

• Use write caching on RAID-1 arrays. Because a RAID-1 write will not complete until both writes have been done (two disks), performance of writes can be improved through the use of a write cache. When using a write cache, be sure it is battery-backed up.

RAID-3: Sequential access to large files

RAID-3 is a parallel process array mechanism, where all drives in the array operate in unison. Similar to data striping, information to be written to disk is split into chunks (a fixed amount of data), and each chunk is written out to the same physical position on separate disks (in parallel). This architecture requires parity information to be written for each stripe of data.

Performance is very good for large amounts of data, but poor for small requests because every drive is always involved, and there can be no overlapped or independent operation. It is well-suited for large data objects such as CAD/CAM or image files, or applications requiring sequential access to large data files. Select RAID-3 for applications that process large blocks of data. It provides redundancy without the high overhead incurred by mirroring in RAID-1.

Note: RAID 1 is actually implemented only as RAID 10 (described below) on the FAStT products.


RAID-5: High availability and fewer writes than reads

RAID level 5 (Figure 2-12) stripes data and parity across all drives in the array. RAID level 5 offers both data protection and increased throughput. When you assign RAID-5 to an array, the capacity of the array is reduced by the capacity of one drive (for data-parity storage). RAID-5 gives you higher capacity than RAID-1, but RAID level 1 offers better performance.

Figure 2-12 RAID 5

RAID-5 is best used in environments requiring high availability and fewer writes than reads.

RAID-5 is good for multi-user environments, such as database or file system storage, where typical I/O size is small, and there is a high proportion of read activity. Applications with a low read percentage (write-intensive) do not perform as well on RAID-5 logical drives because of the way a controller writes data and redundancy data to the drives in a RAID-5 array. If there is a low percentage of read activity relative to write activity, consider changing the RAID level of an array for faster performance.

Use write caching on RAID-5 arrays, because RAID-5 writes will not be completed until at least two reads and two writes have occurred. The response time of writes will be improved through the use of write cache (be sure it is battery-backed up). RAID-5 arrays with caching can give as good performance as any other RAID level, and with some workloads, the striping effect gives better performance than RAID-1.

RAID-10: Higher performance than RAID-1

RAID-10 (Figure 2-13), also known as RAID 0+1, implements block interleave data striping and mirroring. In RAID-10, data is striped across multiple disk drives, and then those drives are mirrored to another set of drives.


Figure 2-13 RAID 10

The performance of RAID-10 is approximately the same as RAID-0 for sequential I/Os. RAID-10 provides an enhanced feature for disk mirroring that stripes data and copies the data across all the drives of the array. The first stripe is the data stripe; the second stripe is the mirror (copy) of the first data stripe, but it is shifted over one drive. Because the data is mirrored, the capacity of the logical drive is 50% of the physical capacity of the hard disk drives in the array.

The recommendations for using RAID-10 are as follows:

• Use RAID-10 whenever the array experiences more than 10% writes. RAID-5 does not perform as well as RAID-10 with a large number of writes.

• Use RAID-10 when performance is critical. Use write caching on RAID-10. Because a RAID-10 write will not be completed until both writes have been done, write performance can be improved through the use of a write cache (be sure it is battery-backed up).

When comparing RAID-10 to RAID-5:

• RAID-10 writes a single block through two writes. RAID-5 requires two reads (read original data and parity) and two writes. Random writes are significantly faster on RAID-10 (see the sketch after this list).

• RAID-10 rebuilds take less time than RAID-5 rebuilds. If a real disk fails, RAID-10 rebuilds it by copying all the data on the mirrored disk to a spare. RAID-5 rebuilds a failed disk by merging the contents of the surviving disks in an array and writing the result to a spare.
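As a rough illustration of why random writes favor RAID-10, the following Python sketch applies the classic write-penalty factors (two back-end disk operations per host write for RAID-10, four for RAID-5) to estimate random write capability. The drive count and the 75 IOPS per drive figure are assumptions for the example, not FAStT measurements.

    def random_write_iops(drives, iops_per_drive, write_penalty):
        # Each host write turns into 'write_penalty' disk operations on the back end.
        return drives * iops_per_drive / write_penalty

    DRIVES = 8             # assumed number of drives in the array
    IOPS_PER_DRIVE = 75    # assumed random I/Os per second per drive

    print("RAID-10:", random_write_iops(DRIVES, IOPS_PER_DRIVE, 2))   # 2 writes per host write -> 300.0
    print("RAID-5: ", random_write_iops(DRIVES, IOPS_PER_DRIVE, 4))   # 2 reads + 2 writes -> 150.0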

RAID-10 is the best fault-tolerant solution in terms of protection and performance, but it comes at a cost. You must purchase twice the number of disks that are necessary with RAID-0.

The following note and Table 2-2 summarize this information.


Summary: Based on the respective level, RAID offers the following performance results:

– RAID-0 offers high performance, but does not provide any data redundancy.
– RAID-1 offers high performance for write-intensive applications.
– RAID-3 is good for large data transfers in applications, such as multimedia or medical imaging, that write and read large sequential chunks of data.
– RAID-5 is good for multi-user environments, such as database or file system storage, where the typical I/O size is small, and there is a high proportion of read activity.
– RAID-10 offers higher performance than RAID-1 and more reliability than RAID-5.

Table 2-2   RAID levels comparison

RAID-0
  Description:   Stripes data across multiple drives.
  Application:   IOPS, Mbps
  Advantage:     Performance, due to parallel operation of the access.
  Disadvantage:  No redundancy. If one drive fails, the data is lost.

RAID-1
  Description:   A disk's data is mirrored to another drive.
  Application:   IOPS
  Advantage:     Performance, as multiple requests can be fulfilled simultaneously.
  Disadvantage:  Storage costs are doubled.

RAID-10
  Description:   Data is striped across multiple drives and mirrored to the same number of disks.
  Application:   IOPS
  Advantage:     Performance, as multiple requests can be fulfilled simultaneously. Most reliable RAID level on the FAStT.
  Disadvantage:  Storage costs are doubled.

RAID-3
  Description:   Drives operate independently with data blocks distributed among all drives. Parity is written to a dedicated drive.
  Application:   Mbps
  Advantage:     High performance for large, sequentially accessed files (image, video, graphical).
  Disadvantage:  Degraded performance with 8-9 I/O threads, random IOPS, and smaller, more numerous IOPS.

RAID-5
  Description:   Drives operate independently with data and parity blocks distributed across all drives in the group.
  Application:   IOPS, Mbps
  Advantage:     Good for reads, small IOPS, many concurrent IOPS, and random I/Os.
  Disadvantage:  Writes are particularly demanding.

Future RAID levels

There are more levels of RAID currently in development that promise to provide users with more options for capacity, performance, and redundancy. This may be accomplished through additional parity disks, different methods of computing parity, error correction algorithms, and so on.

RAID reliability considerations

At first glance, both RAID-3 and RAID-5 would appear to provide excellent protection against drive failure. With today's high-reliability drives, it would appear unlikely that a second drive in an array would fail (causing data loss) before an initial failed drive could be replaced.

However, field experience has shown that when a RAID-3 or RAID-5 array fails, it is not usually due to two drives in the array experiencing complete failure. Instead, most failures are caused by one drive going bad, and a single block somewhere else in the array that cannot be read reliably.


This problem is exacerbated by using large arrays with RAID-5. This “stripe kill” can lead to data loss when the information to re-build the stripe is not available. The end effect of this issue will of course depend on the type of data and how sensitive it is to corruption. While most storage subsystems (including the FAStT) have mechanisms in place to try to prevent this from happening, they cannot work 100% of the time.

Any selection of RAID type should take into account the cost of downtime. Simple math tells us that RAID-3 and RAID-5 are going to suffer from failures more often than RAID 10. (Exactly how often is subject to many variables and is beyond the scope of this book.) The money saved by economizing on drives can be easily overwhelmed by the business cost of a crucial application going down until it can be restored from backup.

Naturally, no data protection method is 100% reliable, and even if RAID were 100% solid, it would not protect your data from accidental corruption or deletion by program error or operator error. Therefore, all crucial data should be backed up by appropriate software, according to business needs.

Array size

Maximum array sizes are dependent on your version of Storage Manager:

• In Storage Manager Version 8.3, you can have an array size of up to 2 TB of raw space.

• In Storage Manager Version 8.4, an array can use the current possible maximum of 30 disks, which with 143 GB disks gives roughly 4.3 TB of raw space. Note that the maximum size of a logical drive (LUN) is still 2 TB.

At the time of writing, there is an upper limit of 30 disks per array, and disks in an array are not tied to any particular expansion unit or drive loop. With SM 8.3 firmware, the 2 TB raw space limitation means that when used with 73 GB drives, you are limited to a 15 drive array.

Raw space means the total space available on your disks. Depending on your RAID level, the usable space ranges from 50% of the raw space for RAID-1 to (N-1) * drive capacity for RAID-5, where N is the number of drives in the array.
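The following Python sketch applies these rules of thumb to estimate usable space from raw space. It assumes equally sized drives and ignores hot-spares and configuration metadata, so treat the results as approximations.

    def usable_capacity_gb(raid_level, drives, drive_gb):
        # Rules of thumb for usable space, assuming N equally sized drives.
        if raid_level == 0:
            return drives * drive_gb          # no redundancy
        if raid_level in (1, 10):
            return drives * drive_gb / 2      # mirroring: 50% of the raw space
        if raid_level in (3, 5):
            return (drives - 1) * drive_gb    # one drive's worth of capacity holds parity
        raise ValueError("unsupported RAID level")

    # Example: a 10-drive array of 143 GB disks.
    for level in (0, 1, 5, 10):
        print(f"RAID-{level}: {usable_capacity_gb(level, 10, 143)} GB usable")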

Table 2-3   RAID level and performance

  RAID level    Data capacity (a)   Sequential I/O performance (b)   Random I/O performance (b)
                                    Read        Write                Read        Write
  Single disk   n                   6           6                    4           4
  RAID-0        n                   10          10                   10          10
  RAID-1        n/2                 7           5                    6           3
  RAID-5        n-1                 7           7 (c)                7           4
  RAID-10       n/2                 10          9                    7           6

  a. In the data capacity column, n refers to the number of equally sized disks in the array.
  b. 10 = best, 1 = worst. Only compare values within each column; comparisons between columns are not valid for this table.
  c. With the write-back setting enabled.

Tip: The first rule for the successful building of good performing storage solutions is to have enough physical space to create arrays and logical drives according to your needs.


Array configuration

Before you can start using the physical disk space, you must configure it. That is, you divide your (physical) disk drives into arrays and create one or more logical drives inside each array.

In simple configurations, you can use all of your drive capacity with just one array and create all of your logical drives in that unique array. However, this presents the following drawbacks:

• If you experience a (physical) drive failure, the rebuild process affects all logical drives, and the overall system performance goes down.

• Read/write operations to different logical drives are still being made to the same set of physical hard drives.

Number of drives

The more physical drives you have per array, the shorter the access time for read and write I/O operations.

You can determine how many physical drives should be associated with a RAID controller by looking at disk transfer rates (rather than at the megabytes per second). For example, if a hard disk drive is capable of 75 nonsequential (random) I/Os per second, about 26 hard disk drives working together could, theoretically, produce 2,000 nonsequential I/Os per second, or enough to hit the maximum I/O handling capacity of a single RAID controller. If the hard disk drive can sustain 150 sequential I/Os per second, it takes only about 13 hard disk drives working together to produce the same 2,000 sequential I/Os per second and keep the RAID controller running at maximum throughput.
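The arithmetic above can be generalized with a short Python helper. The 2,000 I/Os per second controller ceiling and the per-drive rates are the same assumptions used in the text, not measured FAStT limits.

    def drives_to_saturate(controller_iops, iops_per_drive):
        # Approximate number of drives needed to keep one RAID controller at its maximum I/O rate.
        return controller_iops // iops_per_drive

    CONTROLLER_IOPS = 2000
    print("Random workload (75 IOPS per drive):     ", drives_to_saturate(CONTROLLER_IOPS, 75))    # about 26 drives
    print("Sequential workload (150 IOPS per drive):", drives_to_saturate(CONTROLLER_IOPS, 150))   # about 13 drives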

Channel protection planning

Channel protection is a good way to make your system more resilient against hardware failures. Channel protection means that you spread your arrays across multiple enclosures, rather than placing them in one enclosure, so that a failure of a single enclosure does not take a whole array offline. A further benefit is a performance increase, because the I/O requests are processed by multiple drive loop pairs on the FAStT. See Figure 2-14.

Tip: More physical disks for the same overall capacity gives you:

• Performance: By doubling the number of the physical drives, you can expect up to a 50% increase in throughput performance.

• Flexibility: Using more physical drives gives you more flexibility to build arrays and logical drives according to your needs.

• Data capacity: When using RAID-5 logical drives, more data space is available with smaller physical drives because less space (capacity of a drive) is used for parity.


Figure 2-14 No channel protection versus channel protection

When using the automatic configuration feature mentioned in 2.3.5, “Planning your storage structure and performance” on page 27, the tool always chooses channel protection across all available enclosures. See Figure 2-15.

Figure 2-15 Automatic configuration feature

2.3.6 Logical drives and controller ownership

Logical drives, sometimes simply referred to as volumes or LUNs (LUN stands for Logical Unit Number and represents the number a host uses to access the logical drive), are the logical segmentation of arrays. A logical drive is a logical structure you create on a storage subsystem for data storage. A logical drive is defined over a set of drives called an array and has a defined RAID level and capacity (see "RAID levels" on page 28). The drive boundaries of the array are hidden from the host computer.

IBM TotalStorage FAStT Storage Server provides great flexibility in terms of configuring arrays and logical drives (see also Section 3.3.2, “Creating arrays and logical drives” on page 72 for details). However, when assigning logical volumes to the systems, it is very important to remember that the FAStT Storage Server uses a preferred controller ownership approach for communicating with LUNs. This means that every LUN is owned by only one controller. It is, therefore, important at the system level to make sure that traffic is correctly balanced among controllers. This is a fundamental principle for a correct setting of the storage system. See Figure 2-16.

Balancing traffic is unfortunately not always a trivial task. For example, if an application requires large disk space to be located and accessed in one chunk, it becomes harder to balance traffic by spreading the smaller volumes among controllers.

In addition, typically, the load across controllers and logical drives is constantly changing. The logical drives and data accessed at any given time depend on which applications and users are active during that time period, hence the importance of monitoring the system (see 4.2, “Controlling the performance impact of maintenance tasks” on page 80).

Figure 2-16 Balancing LUNs

Assigning ownership

The preferred owner for a logical drive is initially selected by the controller when the logical drive is created (see Figure 2-17). Select the Array → Change → Ownership/Preferred Path menu option to change the preferred controller ownership for a selected array. To change the preferred controller ownership for a logical drive, select Logical Drive → Change → Ownership/Preferred Path.

Tip: Guidelines for LUN assignment and storage partitioning:

• Assign LUNs across all controllers.

• Unless you have special requirements, use the automatic feature (wizard) of Storage Manager to create your LUNs.

• If you have heavily used LUNs, move them away from other LUNs.


Figure 2-17 Preferred controller ownership

Figure 2-18 Redistribute logical drives

Important: A secondary logical drive in a Remote Mirror does not have a preferred owner. Instead, the ownership of the secondary logical drive is determined by the controller owner of the associated primary logical drive. For example, if Controller A owns the primary logical drive in the primary storage subsystem, Controller A owns the associated secondary logical drive in the secondary storage subsystem. Controller ownership changes of the primary logical drive cause a corresponding controller ownership change of the secondary logical drive.

To shift logical drives away from their current owners and back to their preferred owners, select Storage Subsystem → Redistribute Logical Drives, as shown in Figure 2-18.

Tip: For the best performance of a redundant controller system, the system administrator should divide I/O activity (LUNs) between the two RAID controllers. This is accomplished through the Storage Manager GUI, or by using the command line interface.


The preferred controller ownership of a logical drive or array is the controller of an active-active pair that is designated to own these logical drives. The current controller owner is the controller that currently owns the logical drive or array.

If the preferred controller is being replaced or undergoing a firmware download, ownership of the logical drives is automatically shifted to the other controller, and that controller becomes the current owner of the logical drives. This is considered a routine ownership change and is reported with an informational entry in the event log.

There can also be a forced failover from the preferred controller to the other controller because of I/O path errors. This is reported with a critical entry in the event log, and will be reported by the Enterprise Management software to e-mail and SNMP alert destinations.

2.3.7 Segment size

A segment, in a logical drive, is the amount of data, in kilobytes, that the controller writes on a single physical drive before writing data on the next physical drive.

The choice of a segment size can have a major influence on performance in both IOPS and throughput. Small segment sizes increase the request rate (IOPS) by allowing multiple disk drives to respond to multiple requests. Large segment sizes increase the data transfer rate (Mbps) by allowing multiple disk drives to participate in one I/O request.

You can use the performance monitor (see 4.1, “Performance monitoring and tuning” on page 78) to evaluate how a given segment size affects the workload (this, obviously, should be done in a testing environment). Use the following guidelines:

• If the typical I/O size is larger than the segment size, increase the segment size in order to minimize the number of (physical) drives needed to satisfy an I/O request. This is especially true in a multi-user, database, or file system storage environment. Using a single drive for a single request leaves other drives available to simultaneously service other requests.

• If you are using the logical drive in a single-user, large I/O environment (such as for multimedia application storage), performance is optimized when a single I/O request can be serviced with a single data stripe (the segment size multiplied by the number of drives in the array that are used for I/O). In this case, multiple disks are used for the same request, but each disk is only accessed once (see the example after these guidelines).

• Normally, a small segment size is used for databases, normal sizes for file servers, and large segment sizes for multimedia applications.

• If you increase the segment size, the maximum theoretical throughput increases.
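To make the stripe arithmetic concrete, the following Python sketch computes the full data stripe size and checks whether a typical I/O fits in a single segment or in one stripe. The segment size, drive count, and I/O size are assumed example values.

    def stripe_size_kb(segment_kb, data_drives):
        # One full data stripe = segment size x number of drives that carry data for the I/O.
        return segment_kb * data_drives

    SEGMENT_KB = 64    # assumed segment size
    DATA_DRIVES = 4    # assumed number of data drives in the array
    IO_KB = 256        # assumed typical I/O size of the application

    stripe_kb = stripe_size_kb(SEGMENT_KB, DATA_DRIVES)
    print(f"Full stripe size: {stripe_kb} KB")

    if IO_KB <= SEGMENT_KB:
        print("A typical I/O is served by a single drive (good for multi-user, small-I/O workloads).")
    elif IO_KB <= stripe_kb:
        print("A typical I/O is served by one full stripe (good for single-user, large-I/O workloads).")
    else:
        print("A typical I/O spans multiple stripes; consider a larger segment size.")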

Tips: The possible segment sizes are 8 KB, 16 KB, 32 KB, 64 KB, 128 KB, and 256 KB:

• Storage Manager sets a default segment size of 64 KB.

• For database applications, block sizes between 4 KB and 16 KB have been shown to be more effective.

• In a large file environment, such as media streaming or CAD, 128 KB and above is recommended.

• For Web servers and file and print servers, the range should be between 16 KB and 64 KB.


2.3.8 Storage partitioning

Storage partitioning adds a high level of flexibility to the FAStT Storage Server. It enables you to connect multiple and heterogeneous host systems to the same storage server, either in stand-alone or clustered mode. The term storage partitioning is somewhat misleading, because a partition actually represents a host or a group of hosts and the logical drives they access. Storage partitioning was introduced in Storage Manager Version 7.10.

Without storage partitioning, the logical drives configured on a FAStT Storage Server can only be accessed by a single host system or by a single cluster. This can lead to inefficient use of the storage server hardware.

With storage partitioning, on the other hand, you can create sets of objects containing the hosts with their host bus adapters and the logical drives. We call these sets storage partitions. Now, the host systems can only access their assigned logical drives, just as if these logical drives were locally attached to them.

Storage partitioning lets you map and mask LUNs (that is why it is also referred to as LUN masking). That means that after you assign a LUN to a host, it is hidden from all other hosts connected to the same Storage Server. Therefore, access to that LUN is exclusively reserved for that host.

It is a good practice to do your storage partitioning prior to connecting to multiple hosts. Operating systems such as AIX or Windows 2000 may write their signatures to any device they can access.

Heterogeneous host support means that the host systems can run different operating systems. But be aware that all the host systems within a particular storage partition must run the same operating system, because all host systems within a particular storage partition have unlimited access to all logical drives in this partition. Therefore, file systems on these logical drives must be compatible with host systems. To ensure this, it is best to run the same operating system on all hosts within the same partition. Some operating systems might be able to mount foreign file systems. In addition, Tivoli® SANergy® or the IBM SAN File System can enable multiple host operating systems to mount a common file system.

Note: You should do performance testing in your environment before you go into production with a given segment size. Segment size can be dynamically changed, but only by rewriting the data, which consumes bandwidth and impacts performance. Plan this carefully to avoid the issue in a production environment.

Note: There are limitations as to how many logical drives you can map per host. FAStT (with Storage Manager Version 8.4) allows up to 256 LUNs per partition (including the access LUN) and a maximum of two partitions per host. Note that a particular OS platform (see the Restriction box below) can also impose limitations on the number of LUNs it can support. Keep all these limitations in mind when planning your installation.

Restriction: Most hosts will be able to have 256 LUNs mapped per storage partition. Windows NT, Solaris with RDAC, NetWare 5.1, and HP-UX 11.0 are restricted to 32 LUNs. If you try to map a logical drive to a LUN that is greater than 32 on these operating systems, the host will be unable to access it. Solaris requires the use of Veritas Dynamic Multi-Pathing (DMP) for failover with 256 LUNs.


A storage partition is a collection of topological elements (default group, host groups, hosts, and host ports) shown as nodes in the topology view of the mappings view. You must define the various topological elements if you want to define specific logical drive-to-LUN mappings for host groups, or hosts, or both.

In order to do the storage partitioning correctly, you need the WWN of your HBAs. Mapping is done on a WWN basis. Depending on your HBA, you can obtain the WWN either from the BIOS or FAStT MSJ tool if you have Qlogic cards. Emulex adapters and IBM adapters for pSeries® and iSeries™ servers have a sticker on the back of the card, as do the JNI and AMCC adapters for Solaris. The WWN is also usually printed on the adapter itself and/or the box the adapter was shipped in.

If you are connected to a hub or switch, check the Name Server Table of the hub or switch to identify the WWN of the HBAs.

When planning your partitioning, keep in mind that:

• In a cluster environment, you need to use host groups.
• You can optionally purchase partitions.

See 3.3.3, “Configuring storage partitioning” on page 73 for details about how to define your storage partitioning.

2.3.9 Cache parameters

Cache memory is an area of temporary volatile storage (RAM) on the controller that has a faster access time than the drive media. This cache memory is shared for read and write operations.

Efficient use of the RAID controller cache is essential for good performance of the FAStT storage server.

The diagram shown in Figure 2-19 is a schematic model of the major elements of a disk storage system, elements through which data moves (as opposed to other elements such as power supplies). In the model, these elements are organized into eight vertical layers: four layers of electronic components shown inside the dotted ovals and four layers of paths (that is, wires) connecting adjacent layers of components to each other. Starting at the top in this model, there are some number of host computers (not shown) that connect (over some number of paths) to host adapters. The host adapters connect to cache components. The cache components, in turn, connect to disk adapters that, in turn, connect to disk drives.

Here is how a read I/O request is handled in this model. A host issues a read I/O request that is sent over a path (such as a Fibre Channel) to the disk system. The request is received by a disk system host adapter. The host adapter checks whether the requested data is already in cache, in which case, it is immediately sent back to the host. If the data is not in cache, the request is forwarded to a disk adapter that reads the data from the appropriate disk and copies the data into cache. The host adapter sends the data from cache to the requesting host.
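A minimal Python sketch of this read path, with a dictionary standing in for the controller cache and a function standing in for the disk adapter, is shown below. It is a conceptual illustration of the model only, not FAStT firmware behavior.

    cache = {}   # block number -> data already staged in controller cache

    def read_from_disk(block):
        # Stand-in for the disk adapter reading the block from the drives.
        return f"data-{block}"

    def host_read(block):
        if block in cache:               # cache hit: the host adapter returns the data immediately
            return cache[block]
        data = read_from_disk(block)     # cache miss: the request is forwarded to a disk adapter
        cache[block] = data              # the data is copied into cache for subsequent reads
        return data                      # the host adapter sends the data back to the host

    print(host_read(42))   # first access: miss, read from disk
    print(host_read(42))   # second access: hit, served from cache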


Figure 2-19 Conceptual model of disk caching

Most (hardware) RAID controllers have some form of read or write caching, or both. You should plan to take advantage of these caching capabilities, because they enhance the effective I/O capacity of the disk subsystem. The principle of these controller-based caching mechanisms is to gather smaller and potentially nonsequential I/O requests coming in from the host server (for example, SQL Server) and try to batch them with other I/O requests. Consequently, the I/O requests are sent as larger (32 KB to 128 KB) and possibly sequential requests to the hard disk drives. The RAID controller cache arranges incoming I/O requests by making the best use of the hard disks' underlying I/O processing ability. This increases the disk I/O throughput.

There are many different settings (related to caching). When implementing a FAStT Storage Server as part of a whole solution, you should plan at least one week of performance testing and monitoring to adjust the settings.

The FAStT Storage Manager utility enables you to configure various cache settings:

• Read caching
• Cache block size
• Cache read-ahead multiplier
• Write caching or write-through mode
• Enable or disable write cache mirroring
• Start and stop cache flushing levels
• Unwritten cache age parameter

Figure 2-20 shows the default values when using the Create Logical Drive Wizard. With the Storage Manager, you can specify cache settings for each logical drive independently for more flexibility.


Figure 2-20 Default values used by the Create Logical Drive Wizard

These settings can have a large impact on the performance of the FAStT Storage Server and on the availability of data. Be aware that performance and availability often conflict with each other. If you want to achieve maximum performance, in most cases, you must sacrifice system availability and vice versa.

The default settings are read and write cache for all logical drives, with cache mirroring to the alternate controller for all write data. The write cache is only used if the battery for the controller is fully charged. Read ahead is not normally used on the logical drives.

Read caching

The read caching parameter can be safely enabled without risking data loss. There are only rare conditions when it is useful to disable this parameter, which then provides more cache for the other logical drives.

Read-ahead multiplier

This parameter affects the reading performance, and an incorrect setting can have a large negative impact. It controls how many additional sequential data blocks will be stored into cache after a read request.

Obviously, if the workload is random, this value should be zero. Otherwise, each read request will unnecessarily pre-fetch additional data blocks. Because these data blocks are rarely needed, the performance is negatively impacted.

For sequential workloads, a good value is between 1 and 4, depending on the particular environment. When using such a setting, a read request causes pre-fetching of several sequential data blocks into the cache; this speeds up subsequent disk access. This leads to fewer I/O transfers (between disk and cache) being required to handle the same amount of data, which is good for performance in a sequential environment. A value that is too high can cause an overall performance decrease, because the cache is filled with read-ahead data that is never used.

Use the performance monitor to watch the cache hit rate for a logical drive to find a proper value.


Write caching

The write caching parameter enables the storage subsystem to cache write data instead of writing it directly to the disks. This can improve performance significantly, especially for environments with random writes such as databases. For sequential writes, the performance gain varies with the size of the data written. If the logical drive is only used for read access, it might improve overall performance to disable the write cache for this logical drive. Then, no cache memory is reserved for this logical drive.

Write cache mirroring

FAStT write cache mirroring preserves the integrity of cached data if a RAID controller fails. This is excellent from a high availability perspective, but it decreases performance. The data is mirrored between controllers across the drive-side FC loop. This competes with normal data transfers on the loop. It is recommended to keep controller write cache mirroring enabled for data integrity reasons in case of a controller failure.

By default, a write cache is always mirrored to the other controller to ensure proper contents, even if the logical drive moves to the other controller. Otherwise, the data of the logical drive can be corrupted if the logical drive is shifted to the other controller and the cache still contains unwritten data. If you turn off this parameter, you risk data loss in the case of a controller failover, which might also be caused by a path failure in your fabric.

The cache of the FAStT Storage Server is protected, by a battery, against power loss. If the batteries are not fully charged, for example, just after powering on, the controllers automatically disable the write cache. If you enable the parameter, the write cache is used, even if no battery backup is available, resulting in a higher risk of data loss.

Write caching or write-through

Write-through means that writing operations do not use cache at all. The data is always written directly to the disk drives. Disabling write caching frees up cache for reading (because the cache is shared for read and write operations).

Write caching can increase the performance of write operations. The data is not written straight to the disk drives; it is only written to the cache. From an application perspective, this is much faster than waiting for the disk write operation to complete. Therefore, you can expect a significant gain in application writing performance. It is the responsibility of the cache controller to eventually flush the unwritten cache entries to the disk drives.

Write cache mode appears to be faster than write-through mode, because it increases the performance of both reads and writes. But this is not always true, because it depends on the disk access pattern and workload.

A lightly loaded disk subsystem usually works faster in write-back mode, but when the workload is high, the write cache can become inefficient. As soon as the data is written to the cache, it has to be flushed to the disks in order to make room for new data arriving into cache. The controller would perform faster if the data went directly to the disks. In this case, writing the data to the cache is an unnecessary step that decreases throughput.

Starting and stopping cache flushing levels

These two settings affect the way the cache controller handles unwritten cache entries. They are only effective when you configure the write-back cache policy. Writing the unwritten cache entries to the disk drives is called flushing. You can configure the start and stop flushing level values. They are expressed as percentages of the entire cache capacity. When the number of unwritten cache entries reaches the start flushing value, the controller begins to flush the cache (write the entries to the disk drives). The flushing stops when the number of unwritten entries drops below the stop flush value. The controller always flushes the oldest cache entries first. Unwritten cache entries older than 20 seconds are flushed automatically.


A typical start flushing level is 80%. Very often, the stop flushing level is set to 80%, too. This means the cache controller does not allow more than 80% of the entire cache size for write-back cache, but it also tries to keep as much of it as possible for this purpose. If you use such settings, you can expect a high number of unwritten entries in the cache. This is good for writing performance, but be aware that it offers less data protection.

If you are concerned about data protection, you might want to use lower start and stop values. With these two parameters, you can tune your cache for either reading or writing performance.

Performance tests have shown that it is a good idea to use similar values for start and stop flushing levels. If the stop level value is significantly lower than the start value, this causes a high amount of disk traffic when flushing the cache. If the values are similar, the controller only flushes the amount needed to stay within limits.
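The start and stop levels work as a simple hysteresis, which the following Python sketch illustrates for an assumed 1024 MB cache with both levels set to 80%. It is a conceptual model of the behavior described above, not controller firmware.

    CACHE_MB = 1024            # assumed total cache size
    START_PCT, STOP_PCT = 80, 80

    start_mb = CACHE_MB * START_PCT / 100   # flushing begins once unwritten data reaches this amount
    stop_mb = CACHE_MB * STOP_PCT / 100     # flushing stops once unwritten data drops below this amount

    flushing = False

    def update(unwritten_mb):
        # Hysteresis: begin flushing at the start level, stop again below the stop level.
        global flushing
        if not flushing and unwritten_mb >= start_mb:
            flushing = True
        elif flushing and unwritten_mb < stop_mb:
            flushing = False
        return flushing

    for mb in (700, 820, 830, 815, 790):
        print(f"{mb} MB unwritten -> flushing = {update(mb)}")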

Cache block size

This is the size of the cache memory allocation unit and can be either 4 K or 16 K. By selecting the proper value for your particular situation, you can significantly improve the caching efficiency and performance. For example, if applications mostly access the data in small blocks up to 8 K, but you use 16 K for the cache block size, each cache entry block is only partially populated. You always occupy 16 K in cache to store 8 K (or less) of data. This means only up to 50% of the cache capacity is effectively used to store the data. You can expect lower performance. For random workloads and small data transfer sizes, 4 K is better.

On the other hand, if the workload is sequential, and you use large segment sizes, it is a good idea to use a larger cache block size of 16 K. A larger block size means a lower number of cache blocks and reduces cache overhead delays. In addition, a larger cache block size requires fewer cache data transfers to handle the same amount of data.

2.3.10 Hot-spare drive

A hot-spare drive is like a replacement drive installed in advance. Hot-spare disk drives provide additional protection that might prove to be essential in case of a disk drive failure in a fault tolerant array.

When assigning disks as hot-spares, make sure they have enough storage capacity. If the failed disk drive is larger than the hot-spare, reconstruction is not possible. You can find more information in 3.3.1, “Defining hot-spare drives” on page 71.

2.3.11 Remote Volume Mirroring

The Remote Volume Mirror option is a premium feature that comes with the FAStT Storage Manager Version 8.4 software and is enabled by purchasing a premium feature key. The Remote Volume Mirror option is used for online, real-time replication of data between storage subsystems over a remote distance. See Figure 2-21.

Note: There is no definitive recommendation as to how many hot-spares you should install, but it is common practice to use a ratio of one hot-spare for two to three fully populated expansion enclosures (this proves to be sufficient, because disk reliability has improved).


Figure 2-21 Remote Volume Mirroring

The mirroring is managed by the storage subsystem controllers and is transparent to the host machines and applications. You create one or more mirrored logical drive pairs that consist of a primary logical drive at the primary site and a secondary logical drive at a remote site. After you create the mirror relationship between the two logical drives, the controller owner of the primary logical drive copies all of the data from the primary logical drive to the secondary logical drive. This is called a full synchronization.

In the event of a disaster or unrecoverable error at one storage subsystem, the Remote Volume Mirror (RVM) option enables you to promote a secondary storage subsystem to take over responsibility for normal input/output (I/O) operations.

A mirroring relationship is on a logical drive basis:

- It associates two logical drives (primary and secondary) using Storage Manager software.
- Data is copied to a secondary logical drive in the background.

The mirroring is synchronous. The write must be completed to both volumes before the host receives an I/O complete. This obviously has an effect on performance.

A minimum of two storage subsystems is required. One storage subsystem can have primary volumes being mirrored to arrays on other storage subsystems and hold secondary volumes from other storage subsystems. Also note that because replication is managed on a per-logical drive basis, you can mirror individual logical drives in a primary storage subsystem to appropriate secondary logical drives in several different remote storage subsystems.


Intersite with FlashCopy drives and tape backup

The highest availability configuration is fully redundant and includes two storage subsystems and four Fibre Channel switches connected with Inter-Switch Links (ISLs) forming Fibre Channel fabrics, as shown in Figure 2-22.

Figure 2-22 Intersite high availability solution

Apart from the greater redundancy of dual switch fabric in each site, a greater number of host ports are now available, allowing greater flexibility in use and connectivity.

With this type of configuration, consider putting the primary drives at the remote site. This offers several advantages. The first is that if the primary site fails, the standby servers at the secondary site can be attached to the original primary disks with a simple procedure.

Even with the primary drives at the remote site and the secondary drives at the local site providing an up-to-date copy at all times, it is still possible for programming or human error to corrupt the data and for that corruption to be mirrored to the secondary drives. You have several options to protect against this.

Note: RVM requires one (and only one) link per controller. Two RVM volumes on both controllers require two host side ports (one per controller).

RVM uses dedicated host ports for the copying operations (there is no sharing of mini-hubs). For redundancy of the RVM link, you need to account for two mini hubs.

In addition, a switch is required on each end of the fabric connecting the primary and secondary sites.


You can:

- Make a FlashCopy of the data on the primary drive.

- Make a tape backup of the data from the primary drive.

- Combine both, where a FlashCopy is performed and then a tape backup of the copied drive is performed.

2.4 Additional planning considerations

In this section, we review additional elements to consider when planning your FAStT storage subsystems. These considerations include whether or not to use a Logical Volume Manager, multipath drivers, the failover alert delay, and others.

2.4.1 Planning for systems with LVM: AIX example

Many modern UNIX operating systems implement the concept of a Logical Volume Manager (LVM) that can be used to manage the distribution of data on physical disk devices.

The LVM for AIX is a set of operating system commands, library subroutines, and other tools used to control physical disk resources by providing a simplified logical view of the available storage space. Some vendors offer an LVM as a separate product. The AIX LVM is an integral part of the base AIX operating system and is provided at no additional cost.

In this section, we discuss the advantages of having the LVM in the system.

With a UNIX operating system that has LVM, the handling of disk-related I/O is based upon different functional levels, as shown in Figure 2-23.

Figure 2-23 Different functional levels

The lowest level is the physical level and consists of device drivers accessing the physical disks and using the corresponding adapters. The next level is the logical level, managed by the Logical Volume Manager (LVM), which controls the physical disk resources. The LVM provides a logical mapping of disk resources to the application level. The application level can consist of either the journaled file system (JFS) or raw access (for example, used by relational database systems).

Within the AIX LVM, there are five basic logical storage concepts: physical volumes, volume groups, physical partitions, logical volumes, and logical partitions. The relationships among these concepts are depicted in Figure 2-24.


With the AIX LVM:

- Each individual fixed-disk drive (for FAStT, it is referred to as a LUN) is called a physical volume (PV) and has a name (for example, hdisk0, hdisk1, or hdisk2).

- Each physical volume belongs to one volume group (VG).

- All of the physical volumes in a volume group are divided into physical partitions (PPs) of the same size.

- Within each volume group, one or more logical volumes (LVs) are defined. Logical volumes are groups of information located on physical volumes. Data on a logical volume appears contiguous to the user, but can be spread across multiple physical volumes.

- Each logical volume consists of one or more logical partitions (LPs). Each logical partition corresponds to at least one physical partition. If mirroring is specified for the logical volume, additional physical partitions are allocated to store the additional copies of each logical partition (with FAStT, this is not recommended, because FAStT can do the mirroring).

- Logical volumes can serve a number of system purposes (paging, for example), but each logical volume that holds ordinary system data, user data, or programs contains a single journaled file system (JFS). Each JFS consists of a pool of page-size blocks. In AIX Version 4.1 and later, a given file system can be defined as having a fragment size of less than 4 KB (512 bytes, 1 KB, 2 KB).

Figure 2-24 AIX LVM conceptual view

When using FAStT with operating systems that have a built-in LVM, or if an LVM is available, you should make use of the LVM.

The AIX LVM provides a number of facilities or policies for managing both the performance and availability characteristics of logical volumes. The policies that have the greatest impact on performance in a general disk environment are the intra-disk allocation, inter-disk allocation, write scheduling, and write-verify policies.


Because the FAStT system has its own RAID arrays and logical volumes, we do not work with real physical disks at the operating system level. Functions such as intra-disk allocation, write scheduling, and write-verify policies do not help much, and it is hard to determine the performance benefits of using them. They should only be used after additional testing, and it is not unusual that using these functions leads to worse results.

On the other hand, we should not forget about the important inter-disk allocation policy.

Inter-disk allocation policy

The inter-disk allocation policy is used to specify the number of disks that contain the physical partitions of a logical volume. The physical partitions for a given logical volume can reside on one or several disks in the same volume group, depending on the setting of the range option.

By setting the inter-physical volume allocation policy to maximum, you also ensure that the reads and writes are shared among PVs, and in systems like FAStT, also among controllers and communication paths.

If systems are using only one big volume, it is owned by one controller, and all the traffic goes through one path only. This happens because of the static load balancing that FAStT controllers use. See Figure 2-25 for an illustration.

When using LVM mirroring, use an inter-disk allocation policy of maximum in order to spread the physical partitions of the logical volume across as many physical disks, controllers, and communication paths as possible.

Figure 2-25 Inter-disk allocation
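As a brief AIX sketch of this recommendation, the inter-physical volume allocation policy (the range option) is set with the -e flag of mklv or changed later with chlv. The volume group name, logical volume name, hdisk numbers, and partition count below are hypothetical; substitute the LUNs and sizes from your own configuration.

   # Volume group built from two FAStT LUNs owned by different controllers
   mkvg -y datavg hdisk2 hdisk3

   # Logical volume with the inter-disk allocation policy set to maximum (-e x),
   # so its partitions are spread across both LUNs (and both controllers)
   mklv -y datalv -e x datavg 200

   # Change the policy on an existing logical volume (run reorgvg to move data)
   chlv -e x datalv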

2.4.2 Planning for systems without LVM: Windows example

Today, the Microsoft Windows operating system does not have a powerful LVM like some of the UNIX systems (in fact, some UNIX systems also have a rather limited LVM). Distributing the traffic among controllers in such an environment can be a little harder. Windows systems do have an integrated, reduced version of Veritas Volume Manager called the Logical Disk Manager (LDM), but it does not offer the same flexibility as regular LVM products.


If you need greater performance and more balanced systems, you have two options:

- If you want UNIX-like capabilities, you can use the full version of Veritas Volume Manager, which is the LVM tool for Windows. With this product, you get several features that go beyond LDM. Volume Manager does not just replace the Microsoft Management Console (MMC) snap-in; it adds a much more sophisticated set of storage services to Windows 2000. After Windows 2000 is upgraded with Volume Manager, you can better manage multidisk direct server-attached storage (DAS), JBODs (just a bunch of disks), Storage Area Networks (SANs), and RAID.

The main features that you get are sub-disks and disk groups. You can divide a dynamic disk into one or more sub-disks. A sub-disk is a set of contiguous disk blocks that represent a specific portion of a dynamic disk, which is mapped to a specific region of a physical disk. A sub-disk is a subsection of a dynamic disk's public region. A sub-disk is the smallest unit of storage in Volume Manager. Therefore, sub-disks are the building blocks for Volume Manager arrays. A sub-disk can be compared to a physical partition.

With disk groups, you can organize disks into logical collections. You assign disks to disk groups for management purposes, such as to hold the data for a specific application or set of applications. A disk group can be compared to a volume group. By using these concepts, you can make a disk group with multiple LUNs that are spread among the controllers.

- Another possibility is to look at the application layer and try to spread databases or application data in smaller chunks that reside on LUNs owned by different controllers. For example, instead of one database defined as a 200 GB file (on one 200 GB LUN), define two database files of 100 GB that reside on different LUNs (owned by different controllers).

Tuning applications and databases varies from product to product, so we do not describe it in more detail here.

Using Veritas Volume Manager and tuning the databases and applications go beyond the scope of this guide. You should look for more information on the application vendor sites or refer to the vendor documentation.

For Veritas Volume Manager (VxVM), see:

http://www.veritas.com/products/category/ProductDetail.jhtml?productId=volumemanagerwin

Note that Veritas also offers VxVM for other platforms, not just Windows.

Operating systems and applications

There are big differences among operating systems when it comes to tuning. Whereas Windows 2000 or 2003 does not offer any way to tune the operating system itself, the different flavors of UNIX, such as AIX or Linux, give the user a greater variety of parameters that can be set. These details are beyond the scope of this paper. Consult the specific operating system vendor Web site for further information.

The same is true for tuning specific applications or database systems. There is a large variety of systems and vendors, and you should refer to the documentation provided by those vendors for how to best configure your FAStT Storage Server.

Note: At the time of writing this book, the only OS tested and supported by IBM with Veritas Volume Manager is Sun Solaris. Windows 2000 and 2003 have not been tested. If you use Volume Manager for Solaris, you can only use the RDAC or DMP driver. You cannot use both at the same time.


2.4.3 The function of ADT and a multipath driver

In a FAStT Storage Server equipped with two controllers, you can provide redundant I/O paths to the host systems. Two different components provide this redundancy: the Auto-Logical Drive Transfer (ADT) feature and a multipath driver, such as RDAC.

The RDAC multipath driver

The Redundant Disk Array Controller (RDAC) is an example of a multipath device driver that provides controller failover support when a component on the Fibre Channel I/O path fails.

When you create a logical drive, you assign one of the two active controllers to own the logical drive (called preferred controller ownership) and to control the I/O between the logical drive and the application host along the I/O path. The preferred controller normally receives the I/O requests from the logical drive. If a problem along the data path (such as a component failure) causes an I/O to fail, the multipath driver issues the I/O to the alternate controller.

The redundant disk array controller (RDAC) driver manages the Fibre Channel I/O path failover process for storage subsystems in Microsoft Windows NT 4, Windows 2000, Windows Server 2003, IBM AIX, Sun Solaris, and Linux (Storage Manager v. 8.4 and later only) environments with redundant controllers.

RDAC must be installed on the host system. When two RAID controllers are installed in the FAStT (as is the case for most models), and one of the RAID controllers fails or becomes inaccessible due to connectivity problems, RDAC reroutes the I/O requests to the other RAID controller. When the host is equipped with two HBAs (you could also have only one RAID controller installed and connected through a switch to a host with two HBAs), and one of the HBAs fails, RDAC switches over to the other I/O path (that is, failover at the host level).

Auto-Logical Drive Transfer feature (ADT)

ADT is a built-in feature of the controller firmware that enables logical drive-level failover, rather than controller-level failover. ADT is disabled by default and is automatically enabled based on the failover options supported by the host type you specified.

In other words, the same storage subsystem can operate in both modes. For example, if we have Linux and Windows hosts, both attached to a FAStT900, the FAStT900 can present ADT mode to the Linux server for its LUNs, and it can present RDAC mode to the LUNs mapped to the Windows host.

Notes: A multipath device driver, such as RDAC, is not required when the host operating system, HP-UX, for example, has its own mechanism to handle multiple I/O paths.

Veritas Logical Drive Manager with Dynamic Multi-Pathing (DMP) is another example of a multipath driver. This multipath driver requires Array Support Library (ASL) software, which provides information to the Veritas Logical Drive manager for setting up the path associations for the driver.

Note: Starting with Storage Manager Version 8.2, ADT is set by the host type and on a per-LUN basis. This means that heterogeneous support is now extended across all operating system types. (With FAStT Storage Manager Version 7.10, ADT had to be disabled on a controller basis if an operating system that did not support ADT, such as AIX, was used. This restricted heterogeneous support.)


Default settings for failover protection

The storage management software uses the following default settings, based on the host type:

- Multipath driver software with ADT enabled:

This is the normal configuration setting for Novell NetWare, Linux (when using the FC HBA failover driver instead of RDAC), and Hewlett-Packard HP-UX systems. When ADT is enabled and used with a host multipath driver, it helps ensure that an I/O data path is available for the storage subsystem logical drives. The ADT feature changes the ownership of the logical drive that is receiving the I/O to the alternate controller. After the I/O data path problem is corrected, the preferred controller automatically reestablishes ownership of the logical drive as soon as the multipath driver detects that the path is normal again.

- Multipath driver software with ADT disabled:

This is the configuration setting for Microsoft Windows, IBM AIX, Sun Solaris, and Linux (when using the RDAC driver and non-failover Fibre Channel HBA driver) systems. When ADT is disabled, the I/O data path is still protected as long as you use a multipath driver. However, when an I/O request is sent to an individual logical drive and a problem occurs along the data path to its preferred controller, all logical drives on the preferred controller are transferred to the alternate controller. In addition, after the I/O data path problem is corrected, the preferred controller does not automatically reestablish ownership of the logical drive. You must open a storage management window, select Redistribute Logical Drives from the Advanced menu, and perform the Redistribute Logical Drives task.

- No multipath driver software on the host and ADT enabled on the storage subsystem (no failover):

This case is not supported.

The FAStT storage subsystems in this scenario have no failover protection. A pair of active controllers might still be located in a storage subsystem and each logical drive on the storage subsystem might be assigned a preferred owner. However, logical drives do not move to the alternate controller because there is no multipath driver installed. When a component in the I/O path, such as a cable or the controller itself, fails, I/O operations cannot get through to the storage subsystem. The component failure must be corrected before I/O operations can resume. You must switch logical drives to the alternate controller in the pair manually.

Load balancing with RDAC (round robin)

Round-robin (load distribution or load balancing) is used when the RDAC driver discovers that there are multiple data paths from the host to an individual controller. In such a configuration, it is assumed that no penalty is incurred for path switches that do not result in a controller ownership change, thereby enabling the multipath driver to exploit redundant I/O path bandwidth by distributing (in a round-robin fashion) I/O requests across paths to an individual controller.

Note: In ADT mode, RDAC automatically redistributes the LUNs to their preferred path after the failed path is again operational.

Note: In non-ADT mode, the user is required to issue a redistribution command manually to get the LUNs balanced across the controllers.
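The manual redistribution in non-ADT mode can also be scripted. The sketch below assumes a script command for redistributing logical drives to their preferred owners; the exact keyword can differ by firmware and Storage Manager level, so confirm it in the script command reference before relying on it.

   // Move all logical drives back to their preferred controllers
   reset storageSubsystem logicalDriveDistribution;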


The RDAC drivers for Windows and AIX support round-robin load balancing. It is, however, not recommended to enable load balancing for AIX.

2.4.4 ADT alert notification

With Storage Manager Version 8.4, an ADT alert notification is provided. This accomplishes three things:

- It provides notifications for persistent “Volume not on preferred controller” conditions that resulted from ADT.

- It guards against spurious alerts by giving the host a “delay period” after a preferred controller change, so it can get reoriented to the new preferred controller.

- It minimizes the potential for the user or administrator to receive a flood of alerts when many logical drives failover at nearly the same point in time due to a single upstream event, such as an HBA failure.

Upon an ADT event or an induced volume ownership change, the FAStT controller firmware waits for a configurable time interval, called the alert delay period, after which it reassesses the logical drive distribution among the arrays.

If, after the delay period, some logical drives are not on their preferred controllers, the controller that owns the not-on-preferred-path logical drive logs a critical Major Event Log (MEL) event. This event triggers an alert notification, called the logical drive transfer alert. The critical event logged on behalf of this feature is in addition to any informational or critical events that are already logged in the RDAC. This can be seen in Figure 2-26.

Figure 2-26 Example of alert notification in MEL of an ADT/RDAC logical drive failover

2.4.5 Failover alert delay

The failover alert delay lets you delay the logging of a critical event if the multipath driver transfers logical drives to the non-preferred controller.

Note: Volume controller ownership changes occur as a normal part of a controller firmware download. However, the logical-drive-not-on-preferred-controller events that occur in this situation will not result in an alert notification.


If the multipath driver transfers the logical drives back to the preferred controller within the specified delay period, no critical event is logged. If the transfer exceeds this delay period, a logical drive-not-on-preferred-path alert is issued as a critical event. This option can also be used to minimize multiple alerts when many logical drives fail over because of a system error, such as a failed host adapter.

The logical drive-not-on-preferred-path alert is issued for any instance of a logical drive owned by a non-preferred controller and is in addition to any other informational or critical failover events. Whenever a logical drive-not-on-preferred-path condition occurs, only the alert notification is delayed; a needs attention condition is raised immediately.

To make the best use of this feature, set the failover alert delay period such that the host driver failback monitor runs at least once during the alert delay period. Note that a logical drive ownership change might persist through the alert delay period, but correct itself before you can inspect the situation. In such a case, a logical drive-not-on-preferred-path alert is issued as a critical event, but the array will no longer be in a needs-attention state. If a logical drive ownership change persists through the failover alert delay period, refer to the Recovery Guru for recovery procedures.

Changing the failover alert delay

To change the failover alert delay:

1. Select the storage subsystem from the Subsystem Management window, and then select either the Storage Subsystem → Change → Failover Alert Delay menu option, or right-click and select Change → Failover Alert Delay. See Figure 2-27.

Important:

- The failover alert delay option operates at the storage subsystem level, so one setting applies to all logical drives.

- The failover alert delay option is reported in minutes in the Storage Subsystem Profile as a storage subsystem property.

- The default failover alert delay interval is five minutes. The delay period can be set within a range of 0 to 60 minutes. Setting the alert delay to a value of zero results in instant notification of a logical drive not on the preferred path. A value of zero does not mean alert notification is disabled.

- The failover alert delay is activated after controller start-of-day completes to determine if all logical drives were restored during the start-of-day operation. Thus, the earliest that the not-on-preferred-path alert will be generated is after boot up and the configured failover alert delay.


Figure 2-27 Changing the failover alert delay

The Failover Alert Delay dialog box opens, as seen in Figure 2-28.

Figure 2-28 Failover Alert Delay dialog box

2. Enter the desired delay interval in minutes and click OK.

You are returned to the Subsystem Management window.
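The same setting can be applied from the script window or SMcli. This is a minimal sketch that assumes the failoverAlertDelay parameter of the set storageSubsystem command (the value is in minutes); check the script command reference for your Storage Manager version.

   // Log the drive-not-on-preferred-path alert only after a 5-minute delay
   set storageSubsystem failoverAlertDelay=5;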


Chapter 3. FAStT configuration tasks

This chapter recommends a sequence of tasks to set up, install, and configure the IBM TotalStorage FAStT Storage Server, including these:

- Setting up the IP addresses on the FAStT Storage Server
- Cabling the FAStT Storage Server
- Installing the FAStT Storage Manager Client
- Updating the BIOS and firmware of the Storage Server
- Initial setup of the Storage Server
- Defining logical drives and hot-spares
- Setting up storage partitioning


3.1 Preparing the FAStT Storage Server

It is assumed that you have installed the operating system on the host server, have all the necessary device drivers and host software installed, and have a good understanding and working knowledge of the FAStT Storage Server product. If you require detailed information about how to perform the installation, setup, and configuration of this product, refer to the IBM Redbook, IBM TotalStorage FAStT900/600 and Storage Manager 8.4, SG24-7010, at:

http://www.ibm.com/redbooks

Or, consult the documentation for your operating system and host software.

3.1.1 Network setup of the controllers

By default, FAStT tries to use the bootstrap protocol (BOOTP) to request an IP address. If no BOOTP server can be contacted, the controllers fall back to the fixed IP addresses. These fixed addresses, by default, are:

- Controller A: 192.168.128.101
- Controller B: 192.168.128.102

To use the network ports of the controllers, you need to attach both controllers to an Ethernet switch or hub. The built-in Ethernet controller supports either 100 Mbps or 10 Mbps.

To manage storage subsystems through a firewall, configure the firewall to open port 2463 for TCP data.
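How you open this port depends on your firewall product. As a purely illustrative example, on a Linux-based firewall using iptables, a rule along the following lines would let the management traffic through; adapt it to your own firewall and security policy.

   # Allow FAStT out-of-band management traffic (TCP port 2463)
   iptables -A INPUT -p tcp --dport 2463 -j ACCEPT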

To change the default network setting (BOOTP with fallback to a fixed IP address), you need a serial connection to the controllers in the FAStT Storage Server.

To set up the controllers:

1. Connect to the FAStT Storage Server with a null modem cable to the serial port of your system. For the serial connection, choose the correct port and the following settings:

– 19200 Baud
– 8 Data Bits
– 1 Stop Bit
– No Parity
– Xon/Xoff Flow Control

2. Send a break signal to the controller. This varies depending on the terminal emulation. For most terminal emulations, such as HyperTerm, which is included in Microsoft Windows products, press Ctrl+Break.

3. If you only receive unreadable characters, press Ctrl+Break again, until the following message appears:

Press <SPACE> for baud rate within 5 seconds.

Tip: With Version 8.4x of the FAStT Storage Manager Client, and assuming you have the appropriate firmware level for the controllers, it is also possible to set the network settings using the SM Client graphical front end.

Attention: Follow the procedure outlined here exactly as it is presented, because some commands that can be issued from the serial console can have destructive effects (causing loss of data or even affecting the functionality of your FAStT).


4. Press the Space bar to ensure the correct baud rate setting. If the baud rate was set, a confirmation appears.

5. Press Ctrl+Break to log on to the controller. The following message appears:

Press within 5 seconds: <ESC> for SHELL, <BREAK> for baud rate.

6. Press the Esc key to access the controller shell. The password you are prompted for is infiniti.

7. Run the netCfgShow command to see the current network configuration.

8. To change these values, enter the netCfgSet command. For each entry, you are asked to keep, clear, or change the value. After you assign a fixed IP address to Controller A, disconnect from Controller A and repeat the procedure for Controller B. Remember to assign a different IP address.

9. Because the configuration changed, the network driver is reset and uses the new network configuration.

3.1.2 Installing and starting the FAStT Storage Manager Client

You can install the FAStT Storage Manager Client (SM Client) for either in-band management or out-of-band management. It is possible to use both on the same machine if you have a TCP/IP connection and a Fibre Channel connection to the FAStT Storage Server.

In-band management uses the Fibre Channel path to communicate with the FAStT Storage Server, and out-of-band management uses the TCP/IP network to communicate with the FAStT Storage Server. In our example, we use out-of-band management and install the Storage Manager Client on a machine that only has a TCP/IP connection to the FAStT Storage Server.

If you are unable to use a separate network, ensure that you have an adequate password set on your FAStT Storage Server.

There are some advantages to doing out-of-band management over a separate network. First, it makes the storage more secure and limits the number of people that have access to the storage management functions. Second, it provides more flexibility, because it eliminates the need for the storage administrator to access the server console for administration tasks. In addition, the Storage Manager agent and software do not take up resources on the host server.

Installing the SMclient

We assume for this illustration that the SM Client is to be installed on a Microsoft Windows workstation, as is commonly the case. However, the SM Client is available for other OS platforms, such as AIX.

To install the SMclient on a Windows operating system, perform the following steps:

1. Insert the IBM TotalStorage FAStT Storage Manager Version 8.4 CD into the CD-ROM drive.

2. Click Start → Run. The Run window opens. Browse to %source_path% and select the installation executable that is displayed.

3. Follow the instructions presented through the installation.

Tip: For ease of management and security, we recommend installing a management workstation on a separate network.


Starting the SMclient

When you start the FAStT Storage Manager Client, it launches the Enterprise Management window. The first time you start the client, you are prompted to select whether you want an initial discovery of available storage subsystems (see Figure 3-1).

Figure 3-1 Initial Automatic Discovery

The client software sends out broadcasts through Fibre Channel and the subnet of your IP network to find directly attached storage subsystems and other hosts running the FAStT Storage Manager host agent with an attached storage subsystem.

You have to invoke the Automatic Discovery every time you add a new FAStT Storage Server in your network or install new host agents on already attached systems. To have them detected in your Enterprise Management window, click Tools → Rescan. Then, all FAStT Storage Servers are listed in the Enterprise Management window, as shown in Figure 3-2.

If you are connected through FC and TCP/IP, you will see the same FAStT Storage Server twice.

Figure 3-2 Enterprise Management window

The FAStT Storage Server can be connected through Ethernet, or you might want to manage it through the host agent of another host, which is not in the same broadcast segment as your management station. In either case, you have to add the devices manually. Click Edit → Add device and enter the host name or the IP address you want to attach. If you add a FAStT Storage Server that is directly managed, be sure to enter both IP addresses, one per controller. You receive a warning message from Storage Manager if you only assign an IP address to one controller.
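If you prefer the command line, SMcli can add a directly managed FAStT in one step. The sketch below uses the default controller IP addresses mentioned earlier as placeholders; substitute your own addresses and verify the option syntax for your SMcli level.

   # Add a directly managed storage subsystem (one IP address per controller)
   SMcli -A 192.168.128.101 192.168.128.102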

Note: When you install FAStT Storage Manager Client on a stand-alone host and manage storage subsystems through the Fibre Channel I/O path, rather than through the network, you must install the TCP/IP software on the host and assign an IP address to the host.


To choose the storage subsystem you want to manage, right-click and select Manage Device for the attached storage subsystem. This launches the Subsystem Management window (Figure 3-3).

Figure 3-3 First launch of the Subsystem Management window

Verify that the enclosures in the right side of the window reflect your physical layout. If the enclosures are listed in an incorrect order, select Storage Subsystem → Change → Enclosure Order and sort the enclosures according to your site setup.

3.1.3 Updating the controller microcode

It is recommended that your FAStT Storage Server always be at a recent level of microcode. Occasionally, IBM will withdraw older levels of microcode from support. In this case, an upgrade to the microcode is mandatory. In general, you should plan on upgrading all drivers, microcode, and management software in your SAN on a periodic basis. New code levels may contain important fixes to problems you may not have encountered yet.

The microcode of the FAStT Storage Server consists of two packages:

- The firmware
- The NVSRAM package, including the settings for booting the FAStT Storage Server

The NVSRAM is similar to the settings in the BIOS of a host system. The firmware and the NVSRAM are not independent. Be sure to install the correct combination of the two packages. To update the controller microcode:

1. The upgrade procedure needs two independent connections to the FAStT Storage Server, one for each controller. It is not possible to perform a microcode update with only one controller connected. Therefore, both controllers must be accessible either through Fibre Channel or Ethernet. Both controllers must also be in the active state.

If you plan to upgrade through Fibre Channel, make sure that you have a multipath I/O driver installed on your management host, for example, the FAStT RDAC package. This is necessary, because the access logical drive moves from one controller to the other during this procedure, and the FAStT Storage Server must be manageable during the entire time.

Tip: You can be automatically notified by e-mail whenever an update is available. Refer to 4.5.1, “Being up-to-date with your drivers and firmware using My support” on page 88.


2. Open the Subsystem Management window for the FAStT Storage Server you want to upgrade.

To download the firmware, highlight the storage subsystem. From the Storage Subsystem menu, click Download → Firmware.

You might also be asked to synchronize the clocks on the FAStT Storage Server with the host that you are using.

3. After you upgrade the firmware, you must also upgrade NVSRAM.

Highlight the storage subsystem again and click Storage Subsystem → Download → NVSRAM.

Because the NVSRAM is much smaller than the firmware package, it does not take as long as the firmware download.

After the upgrade procedure, it is not necessary to power cycle the FAStT. After the download, the controllers are rebooted automatically one by one and the FAStT Storage Server is online again.

If the FAStT Storage Server is not recognized or is unresponsive in the Enterprise Management window after the upgrade, remove the device from the Enterprise Management window and initiate a new discovery. If the FAStT Storage Server is still unresponsive, reboot the host system and initiate a discovery when the system is up again. This can be caused by the host agent not properly recognizing the updated FAStT.
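The download can also be scripted. The sketch below uses the download storageSubsystem command as documented for later Storage Manager CLI levels; the keywords, and especially the file names, are placeholders that you must replace and verify against the command reference for your version.

   // Download new controller firmware, then the matching NVSRAM package
   download storageSubsystem firmware file="firmware_package.dlp";
   download storageSubsystem NVSRAM file="nvsram_package.dlp";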

3.2 FAStT cabling

In the following sections, we explain the typical recommended cabling configurations for the FAStT600 and FAStT900, respectively.

3.2.1 FAStT600 cabling configuration

The basic design point of a FAStT Storage Server is to have hosts attach directly to it.

The best practice for attaching host systems to your FAStT storage is to use a fabric attachment (SAN attached) with Fibre Channel switches, as explained in the second part of this section. For a simple installation, it is, however, possible and acceptable to directly attach the FAStT600 to a single host with two HBAs.

Important:

- Ensure that all hosts attached to FAStT have a multipath I/O driver installed.

- Any power or network/SAN interruption during the update process can lead to configuration corruption. Therefore, do not turn off the power to the FAStT Storage Server or the management station during the update.

- If you use Fibre Channel hubs as SAN devices (as opposed to switches) or if you use directly attached hosts, do not perform the update over in-band management. This can cause a loop initialization process (LIP) and interrupt the update process.

Note: If you applied any changes to the NVSRAM settings, for example, running a script, you must re-apply them after the download of the new NVSRAM completes. The NVSRAM update resets all settings stored in the NVSRAM to their defaults.


FAStT600 direct attach

The FAStT600 offers fault tolerance on both HBAs and FAStT controllers. At the same time, you can get higher performance, because the dual controller allows for distribution of the load. See the left side of Figure 3-4.

Figure 3-4 FAStT600 cabling configuration

The FAStT600 supports a dual-node cluster without using a switch. This is shown on the right side of Figure 3-4. Because of its four Fibre Channel host ports, this provides the lowest-priced solution for two-node clusters.

FAStT600 Fibre switch attached

The recommended configuration is to connect the FAStT600 to Fibre Channel switches to expand its connectivity to multiple servers, as shown in Figure 3-5.

Multiple hosts can access a single FAStT system, but also have the capability of accessing data on any FAStT subsystem within the SAN. This configuration allows more flexibility and growth capability within the SAN: The attachment of new systems is made easier when adopting such structured cabling techniques.


Figure 3-5 FAStT600 connected through managed hub or Fibre switches

Figure 3-6 shows an example of a dual FAStT600 configuration in a SAN fabric.

Figure 3-6 Dual FAStT600 connected through Fibre switches


The FAStT600 Turbo supports up to seven expansion units (on the base FAStT600, you can have up to three expansion units, but you need to purchase a license). The diagram in Figure 3-7 shows the connection scheme with two expansion units.

Figure 3-7 Dual expansion unit Fibre Channel cabling

Please note that in order to have path redundancy, you need to connect a multipath loop to the FAStT600 from the EXP700. As shown in Figure 3-7, Loop A is connected to Controller A, and Loop B is connected to Controller B: If there were a break in one of the fiber cables, the system would still have a path for communication with the EXP700, thus providing continuous uptime and availability.

3.2.2 FAStT700/FAStT900 cabling configuration

Figure 3-8 illustrates the rear view of a FAStT900. There are up to four host mini-hubs (two are standard). The mini-hubs numbered 1 and 3 correspond to the top controller (controller A), and mini-hubs 2 and 4 correspond to the bottom controller (controller B).

Note: Although storage remains accessible, Storage Manager will report a path failure and request that you check for a faulty cable connection to the FAStT.


Figure 3-8 Rear view of the FAStT900 Storage Server

To ensure redundancy, you must connect each host to both RAID controllers (A and B).

Figure 3-9 illustrates a direct connection of hosts (each host must be equipped with two host adapters).

Figure 3-9 Connecting hosts directly to the controller

Figure 3-10 illustrates the recommended dual-path configuration using Fibre Channel switches (rather than direct attachment). Host 1 contains two HBAs that are connected to host mini-hubs. To configure a host with dual-path redundancy, connect the first host bus adapter (HBA1) to SW1 and HBA2 to SW2. Then, connect SW1 to host mini-hub 1 and SW2 to host mini-hub 2.


Figure 3-10 Using two Fibre Channel switches to connect a host

Devices can be dynamically added to the mini hubs. A Fibre Channel loop supports 127 addresses. This means the FAStT900 can support up to 8 EXP700 expansion enclosures, or 11 EXP500 expansion enclosures per drive loop, for a total of 112 or 110 drives being addressed.

Because two fully redundant loops can be set, we can connect up to 16 EXP700 expansion enclosures or 22 EXP500 expansion enclosures, for a total of up to 224 disk drives (if using the EXP700) or 220 disk drives (if using the EXP500) without a single point of failure.

On the drive-side mini-hub, one SFP module port is marked as IN, the other one as OUT. We recommend that you always connect outgoing ports on the FAStT900 to incoming ports on the EXP700. This ensures clarity and consistency in your cabling, making it easier and more efficient to maintain or troubleshoot.

For the FAStT900 drive-side Fibre Channel cabling, as shown in Figure 3-11 on page 69:

1. Start with the first expansion unit of drive enclosures group 1 and connect the In port on the left ESM board to the Out port on the left ESM board of the second (next) expansion unit.

2. Connect the In port on the right ESM board to the Out port on the right ESM board of the second (next) expansion unit.

3. If you are cabling more expansion units to this group, repeat steps 1 and 2, starting with the second expansion unit.

Tip: It is recommended that you remove small form-factor pluggable (SFP) modules from unused mini-hubs.

Important: It is recommended that you use the EXP700 with the FAStT900. If any EXP500 is connected into the loop, you should manually set the 1 Gbps speed switch to force all the devices and hosts connected to this FAStT900 to work at 1 Gbps. With some EXP700 units, the switch is behind a cover plate screwed to the back of the unit (the plate prevents the switch from being moved inadvertently).


4. If you are cabling a second group, repeat step 1 to step 3 and reverse the cabling order; connect from the Out ports on the ESM boards to the In ports on successive expansion units according to the illustration on the left. See Figure 3-11.

5. Connect the Out port of drive-side mini hub 4 (far left drive side) to the In port on the left ESM board of the last expansion unit in the drive enclosures group 1.

6. Connect the In port of drive-side mini hub 3 to the Out port on the right ESM board of the first expansion unit in the drive enclosures group 1.

7. If you are cabling a second group, connect the Out port of the drive-side mini hub 2 to the In port on the left ESM board of the first expansion unit in drive enclosures group 2. Then, connect the In port of the drive-side mini hub 1 (far right drive side) to the Out port on the right ESM board of the last expansion unit in Drive enclosures group 2.

8. Ensure that each expansion unit has a unique ID (switch setting) and that the left and right ESM board switch settings on each expansion unit are identical.

Tip: Figure 3-11 shows that there are two drive loop pairs (A/B and C/D). The best practice is to distribute the storage (the EXP units) across the two available drive loop pairs for redundancy and better performance.

When you have multiple expansion enclosures, it is always best to create RAID arrays across the expansion enclosures for channel protection and redundancy.


Figure 3-11 FAStT900 drive-side Fibre Channel cabling


3.2.3 Expansion unit numbering

For reasons having to do with the address assignment scheme for the drives, you should observe a few guidelines when assigning drawer IDs to your expansion units. Failure to adhere to these guidelines can cause issues with I/O error recovery and make the troubleshooting of certain drive communication issues difficult:

- Try to limit the number of expansion units to eight for each drive loop pair.

- Ensure that the least significant digit in the drawer ID is unique on each drive loop pair. For instance, a loop of purely EXP700s should be numbered 0-7 for the first loop pair and 20-27 for the second loop pair.

- If you must use more than eight drawers in a loop pair (using EXP500s), or if you are intermixing EXP500 and EXP700 units, you MUST refer to the Fibre Channel Hard Drive and Storage Expansion Enclosure Installation and Migration Guide, IBM publication GC26-7639, available at:

http://www-1.ibm.com/support/docview.wss?uid=psg1MIGR-55466

3.3 Configuring the FAStT Storage Server

Now that you have set up the Storage Server and it is connected to a server or the SAN, you can proceed with additional configuration and storage setting tasks.

Before defining arrays or logical drives, you must perform some basic configuration steps. This also applies when you reset the configuration of your FAStT Storage Server.

1. If you install more than one FAStT Storage Server, it is important to give each one a meaningful name. To name or rename the FAStT Storage Server, open the Subsystem Management window. Right-click the subsystem, and click Storage Subsystem → Rename.

2. Because the FAStT Storage Server stores its own event log, synchronize the controller clocks with the time of the host system used to manage the FAStT units. If you have not already set the clocks on the Storage Servers, set them now. Be sure that your local system is using the correct time. Then click Storage Subsystem → Set Controller Clock.

3. For security reasons, especially if the FAStT Storage Server is directly attached to the network, you should set a password. This password is required for all actions on the FAStT Storage Server that change or update the configuration in any way.

To set a password, highlight the storage subsystem, right-click, and click Change → Password. This password is then stored on the FAStT Storage Server. It is used if you connect through another FAStT Client or the FAStT Field Tool. It does not matter whether you are using in-band or out-of-band management.

Note: Make sure the time of the controllers and the attached systems are synchronized. This simplifies error determination when you start comparing the different event logs. A network time server can be useful for this purpose.
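These initial steps can also be performed from the script window or SMcli. The following is a sketch only; the name and password are placeholders, and the set storageSubsystem parameters should be verified against the script command reference for your firmware level.

   // Give the storage subsystem a meaningful name
   set storageSubsystem userLabel="FAStT900_ITSO_1";

   // Synchronize the controller clocks with the management station
   set storageSubsystem time;

   // Set the configuration password
   set storageSubsystem password="yourPassword";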


3.3.1 Defining hot-spare drives

Hot-spare drives are special, reserved drives that are not normally used to store data. But if a drive in a RAID array with redundancy, such as RAID 1, 5, or 10, fails, the hot-spare drive takes on the function of the failed drive, and the data is recovered on the hot-spare drive, which becomes part of the array. After this procedure, your data is again fully protected. Even if another drive fails, this cannot affect your data.

If the failed drive is replaced with a new drive, the data stored on the hot-spare drive is copied back to the replaced drive, and the original hot-spare drive that is now in use becomes a free hot-spare drive again. The location of a hot-spare drive is fixed and does not wander if it is used.

A hot-spare drive defined on the FAStT Storage Server is always used as a so-called global hot-spare. That is, a hot spare drive can always be used for a failed drive. It is not important in which array or storage enclosure it is situated.

A hot-spare drive must have at least the capacity of the configured space on the failed drive. The FAStT Storage Server can use a larger drive to recover a smaller failed drive; the remaining capacity on the hot-spare is then blocked.

If you plan to use several hot-spare drives, the FAStT Storage Server uses a certain algorithm to define which hot-spare drive is used. The controller first attempts to find a hot-spare drive on the same channel as the failed drive. The drive must be at least as large as the configured capacity of the failed drive. If a hot-spare drive does not exist on the same channel, or if it is already in use, the controller checks the remaining hot-spare drives, beginning with the last hot-spare configured. For example, the drive in enclosure 1, slot 4, might fail and the hot-spare drives might be configured in the following order:

- HSP 1: enclosure 0, slot 12
- HSP 2: enclosure 2, slot 14
- HSP 3: enclosure 4, slot 1
- HSP 4: enclosure 3, slot 14

In this case, the controller checks the hot-spare drives in the following order: 3:14 → 4:1 → 2:14 → 0:12

The controller uses a free hot-spare drive as soon as it finds one, even if there is another one that might be closer to the failed drive.

To define a hot-spare drive, highlight the drive you want to use. From the Subsystem Management window, click Drive → Hot Spare → Assign.

If there are larger drives defined in any array on the FAStT Storage Server than the drive you chose, a warning message appears and notifies you that not all arrays are protected by the hot-spare drive.

The newly defined hot-spare drive then has a small red cross in the lower part of the drive icon.

Especially in large configurations with arrays containing numerous drives, we recommend the definition of multiple hot spare drives, because the reconstruction of a failed drive to a hot spare drive and back to a replaced drive can take a long time. See also 2.3.10, “Hot-spare drive” on page 44.

To unassign a hot-spare drive and have it available again as a free drive, highlight the hot-spare drive and select Drive → Hot Spare → Unassign.
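Hot-spare assignment can also be scripted with the set drive command. This is a sketch, with the enclosure and slot IDs as placeholders; verify the syntax against the script command reference for your Storage Manager version.

   // Assign the drive in enclosure 0, slot 12 as a global hot-spare
   set drive [0,12] hotSpare=TRUE;

   // Return the drive to the pool of unassigned drives
   set drive [0,12] hotSpare=FALSE;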


3.3.2 Creating arrays and logical drives

At this stage, the storage subsystem has been installed and upgraded to the newest microcode level. Now, the arrays and logical drives can be configured. If you are not sure how to divide the available drives into arrays or logical drives and which restrictions apply to avoid improper or inefficient configurations of the FAStT Storage Server, see the IBM Redbook, IBM TotalStorage FAStT600/900 and Storage Manager 8.4, SG24-7010, available at:

http://www.ibm.com/redbooks

To create arrays and logical drives:

1. In the Subsystem Management window, right-click the unconfigured capacity and select Create Logical Drive.

– The current default host type should be set the first time; from then on, the dialog uses the last host type selected.

– You can select the check box to disable this dialog box now. If you want to change the default host type later, click Storage Subsystem → Change → Default Host Type in the Subsystem Management window.

2. The Create Logical Drive Wizard opens. It leads you through the creation of your logical drives. Choose the RAID level and number of drives that are used in the array, either by the manual option, or the automatic drive selection option. Unless you have a specific need to specify the drives that are being used, select automatic.

3. Define the logical drive. By default, all available space in the array is configured as one logical drive.

– Assign a name to the logical drive.

– If you want to change advanced logical drive settings, such as segment size or cache settings, select the Customize settings option.

We recommend that the newly created logical drive not be mapped automatically and that it remain unmapped for now. Otherwise, the drive is immediately seen by the attached hosts, and if you change the mapping later, the logical drive, which appears as a physical drive to the operating system, is removed without notifying the hosts. This can cause severe problems.

4. On the Specify Advanced Logical Drive Parameters panel, define the logical drive exactly to suit your needs:

- For the logical drive I/O characteristics, you can specify file system, database, or multimedia usage. Or, you can manually set the parameters for the logical drive by selecting Custom.

- The segment size is chosen according to the usage pattern. For custom settings, you can directly define the segment size.

- You can also define the cache read-ahead multiplier. Begin by choosing only small values. Otherwise, large parts of the cache are filled by read-ahead data that might never be used.

- The preferred controller handles the logical drive normally if both controllers and I/O paths are online. You can load balance your logical drives across both controllers. The default is to alternate the logical drives between the two controllers.

- You can choose to set the Logical Drive to LUN mapping parameter to run automatically or to be delayed by mapping it later with storage partitioning. If you choose to map it later to the default host group, keep in mind that the logical drive becomes visible immediately after creation. The recommendation is to never leave LUNs in the default group.


If the logical drive is smaller than the total capacity of the array, a window opens and asks whether you want to define another logical drive on the array. The alternative is to leave the space as unconfigured capacity. After you define all logical drives on the array, the array is now initialized and immediately accessible.

If you left unconfigured capacity inside the array, you can define another logical drive in this array later. Simply highlight this capacity, right-click, and choose Create Logical Drive. Follow the steps that we outlined in this section, except for the selection of drives and RAID level. Because you already defined arrays that contain free capacity, you can choose where to store the new logical drive, on an existing array or on a new one.
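Array and logical drive creation can be scripted as well. The sketch below uses the create logicalDrive command with automatic drive selection; the drive count, RAID level, capacity, segment size, label, and owner are examples only, and the parameter names should be checked against the script command reference for your Storage Manager version.

   // Create a RAID 5 array from five drives and define a 100 GB logical drive on it
   create logicalDrive driveCount=5 raidLevel=5 userLabel="data_1"
      capacity=100 GB segmentSize=64 owner=a;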

3.3.3 Configuring storage partitioningWith storage partitioning, heterogeneous hosts can be attached to the FAStT Storage Server. You then need to configure storage partitioning for two reasons:

• Each host operating system requires slightly different settings on the FAStT Storage Server, so you need to tell the storage subsystem the host type that is attached.

• There is interference between the hosts if every host has access to every logical drive. By using storage partitioning and LUN masking, you ensure that each host or host group only has access to its assigned logical drives.

To configure storage partitioning, follow these steps:

1. Select Mappings View in the Subsystem Management window.

2. All information, such as host ports and logical drive mappings, is shown and configured here. The right side of the window lists all mappings that are owned by the object you choose on the left side. If you highlight the storage subsystem, you see a list of all defined mappings. If you highlight a specific host group or host only, its mappings are listed.

3. Define the host groups. Highlight the Default Group, right-click, and select Define Host Group.

4. The host groups are now defined, and the hosts in these groups can be defined next. Highlight the group for which you want to add a new host. Right-click, select Define Host, and enter your desired host name. It is a good idea to make the host name something that is descriptive of the host that it represents.

If you accidentally assigned a host to the wrong host group, you can move the host to another group. Simply right-click the host name and select Move. A pop-up window opens and asks you to specify the host group name.

5. Because storage partitioning of the FAStT Storage Server is based on the World Wide Names of the host ports, the definitions for the host groups and the hosts only represent a view of the physical and logical setup of your fabric. When this structure is available, it is much easier to identify which host ports are allowed to see the same logical drives and which are in different storage partitions.

Note: If only one server will access the logical disks in a storage partition, it is not necessary to define a host group, because you could use the default host group. However, as requirements are constantly changing, we recommend that you define a host group anyway. Otherwise, the addition of new systems is not possible without disrupting the mappings already defined in the default host group.


6. Storage partitioning is not the only function of the storage server that uses the definition of the host port. When you define the host ports, the operating system of the attached host is defined as well. Through this information, FAStT can adapt the RDAC or ADT settings for the hosts.

It is important to choose the correct operating system from the list of available operating systems, because this is the part of the configuration where you configure the heterogeneous host support. Each operating system expects slightly different settings and handles SCSI commands a little differently. Therefore, it is important to select the correct value. If you do not, your operating system might not boot anymore, or path failover might not work when the host is connected to the storage subsystem.

The host port is identified by the World Wide Name of the host bus adapter. Highlight the host, right-click, and select Define Host Port. In the Define Host Port dialog box, enter the port name for this adapter and choose the correct operating system. The host port identifier corresponds to the World Wide Name of the adapter port. In the drop-down box, you only see the World Wide Names that are currently active. If you want to enter a host port that is not currently active, type the World Wide Name in the field. Be sure to check for typing errors. If the WWN does not appear in the drop-down box, make sure you verify your zoning for accuracy.
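Because a mistyped World Wide Name silently creates a host port definition that never matches a real adapter, a simple format check can catch the most common errors before you save the definition. The following is a minimal sketch, independent of Storage Manager; the sample WWN values are hypothetical.

import re

# A Fibre Channel World Wide Name is 16 hexadecimal digits (8 bytes).
# This check catches the most common typing errors (wrong length,
# stray separators, non-hex characters) before the WWN is entered.
def looks_like_wwn(value: str) -> bool:
    candidate = value.replace(":", "").replace("-", "").strip()
    return bool(re.fullmatch(r"[0-9A-Fa-f]{16}", candidate))

print(looks_like_wwn("10:00:00:00:C9:2B:5D:3A"))  # True  - hypothetical HBA port WWN
print(looks_like_wwn("10000000C92B5D3"))          # False - one digit short

This does not replace verifying the WWN against the adapter label or the switch name server; it only screens out typing mistakes.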

7. Define the mapping for each of the logical drives that have been created. All the information entered in the Define Host Port dialog box is needed to ensure proper operation in a heterogeneous environment with multiple servers attached to the FAStT Storage Server.

Highlight the host group to which you want to map a new logical drive. Right-click and select Define Additional Mapping.

8. In the Define Additional Mapping dialog box, select the logical drive you want to map to this host group and assign the correct LUN number.

a. In the top drop-down list, you can choose the host group or host to which you want to map the logical drive.

b. With the logical unit number, you can influence the order in which the mapped logical drives appear. Starting with LUN 0, the logical drive appears in the operating system.

c. In the list box that follows, you see all unmapped drives. Choose the logical drive you want to map.

If you entered all the information, click Add to finish defining this mapping. The first mapping is now defined. In the Subsystem Management window, you see that the number of used storage partitions changed from 0/64 to 1/64.

You can define all other mappings by repeating these steps. You receive an error message after the last logical drive is mapped to a host group or host.

If you have a single server in a host group that has one or more LUNs assigned to it, it is recommended to assign the mapping to the host and not the host group. All servers having the same host type, for example, all Windows NT servers, can be in the same group if you want, but by mapping at the host level, you can define what specific server accesses what specific LUN.

If you have a cluster, it is good practice to assign the LUNs to the host group so that all of the servers in the host group have access to the LUNs.

Note: If you create a new mapping or change an existing mapping of a logical drive, the change happens immediately. Therefore, make sure that this logical drive is not in use or even assigned by any of the machines attached to the storage subsystem.


All logical drives and their mappings are now defined and accessible by their mapped host systems.

To make the logical drives available to the host systems without rebooting, the FAStT Utilities package provides the hot_add command line tool for some operating systems. You simply run hot_add, and all host bus adapters are re-scanned for new devices, and the devices are assigned within the operating system.

You might have to take appropriate steps to enable the use of the storage inside the operating system, such as formatting the disks with a file system and mounting them.

If you attached a Linux or AIX system to the FAStT Storage Server, you need to delete the mapping of the access LUN. Highlight the host or host group containing the Linux or AIX system in the Mappings View. In the right side of the window, you see the list of all logical drives mapped to this host or host group. To delete the mapping of the access logical drive, right-click it and select Delete. The mapping of the access logical drive is deleted immediately.


Chapter 4. FAStT maintenance tasks

This chapter describes various maintenance functions of FAStT, such as performance monitoring, error reporting, and alerts. It also covers how to use performance monitor data to make decisions regarding the tuning of the storage subsystem.


4.1 Performance monitoring and tuning

This section describes the performance monitor and how the data it provides can be used to tune various parameters. It also looks at what settings can be adjusted and what results to expect.

4.1.1 The performance monitor

Use the performance monitor option to select logical drives and controllers to monitor or to change the polling interval. To change the polling interval, choose a number of seconds in the spin box. Each time the polling interval elapses, the performance monitor re-queries the storage subsystem and updates the statistics in the table.

If you are monitoring the storage subsystem in real time, update the statistics frequently by selecting a short polling interval, for example, five seconds. If you are saving results to a file to look at later, choose a slightly longer interval, for example, 30 to 60 seconds, to decrease the system overhead and the performance impact.

The performance monitor does not dynamically update its display if any configuration changes (for example, the creation of new logical drives or a change in logical drive ownership) occur while the monitor window is open. The Performance Monitor window must be closed and then reopened for the changes to appear.

Using the performance monitor to collect performance data can affect the normal storage subsystem performance, depending on the polling interval that you set.

If the storage subsystem you are monitoring begins in or transitions to an unresponsive state, an informational dialog box opens, stating that the performance monitor cannot poll the storage subsystem for performance data.

Use the performance monitor data to make storage subsystem tuning decisions, as described in the following sections.

Total I/Os

This data is useful for monitoring the I/O activity of a specific controller and a specific logical drive, which can help identify possible high-traffic I/O areas.

If the I/O rate is slow on a logical drive, try increasing the array size, that is, the number of drives in the array.

You might notice a disparity in the Total I/Os (workload) of controllers, for example, the workload of one controller is heavy or is increasing over time, while that of the other controller is lighter or more stable. In this case, consider changing the controller ownership of one or more logical drives to the controller with the lighter workload. Use the logical drive Total I/O statistics to determine which logical drives to move.
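As a rough illustration of this decision, the sketch below totals Total I/O figures per controller and points out a candidate logical drive to move; the drive names and counters are hypothetical values you would read from the performance monitor, and the single-busiest-drive heuristic is deliberately crude.

# Illustrative only: total the Total I/O statistics per controller and
# suggest a logical drive to move from the busier controller to the lighter one.
logical_drives = {                     # name: (owning controller, Total I/Os)
    "DB_Data":   ("A", 120_000),
    "DB_Logs":   ("A", 45_000),
    "FileServe": ("B", 30_000),
}

per_controller = {"A": 0, "B": 0}
for _name, (ctrl, ios) in logical_drives.items():
    per_controller[ctrl] += ios

busy = max(per_controller, key=per_controller.get)
idle = min(per_controller, key=per_controller.get)
candidate = max((n for n, (c, _) in logical_drives.items() if c == busy),
                key=lambda n: logical_drives[n][1])
print(f"Controller {busy} is busiest; consider moving {candidate} to controller {idle}")

In practice you would usually move one of the smaller contributors first and re-measure, rather than shifting the single heaviest logical drive in one step.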

If you notice the workload across the storage subsystem (Storage Subsystem Totals Total I/O statistic) continues to increase over time, while application performance decreases, this might indicate the need to add additional storage subsystems to your installation so that you can continue to meet application needs at an acceptable performance level.

Tip: There is no perfect guideline for storage performance optimization that is valid in every environment and for every specific situation. The best way to understand disk I/O and throughput requirements is to monitor an existing system.


Read percentage

Use the read percentage for a logical drive to determine actual application behavior. If there is a low percentage of read activity relative to write activity, consider changing the RAID level of an array from RAID-5 to RAID-1 for faster performance.

Cache hit percentage

A higher percentage is desirable for optimal application performance. There is a positive correlation between the cache hit percentage and I/O rates.

The cache hit percentage of all of the logical drives might be low or trending downward. This might indicate inherent randomness in access patterns, or at the storage subsystem or controller level, this can indicate the need to install more controller cache memory if you do not have the maximum amount of memory installed.

If an individual logical drive is experiencing a low cache hit percentage, consider enabling cache read ahead for that logical drive. Cache read ahead can increase the cache hit percentage for a sequential I/O workload.

Determining the effectiveness of a logical drive cache read-ahead multiplier

To determine if your I/O has sequential characteristics, try enabling a conservative cache read-ahead multiplier (four, for example). Then, examine the logical drive cache hit percentage to see if it has improved. If it has, indicating that your I/O has a sequential pattern, enable a more aggressive cache read-ahead multiplier (eight, for example). Continue to customize logical drive cache read-ahead to arrive at the optimal multiplier (in the case of a random I/O pattern, the optimal multiplier is zero).
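The same trial-and-error procedure can be written down as a small decision rule. The sketch below assumes the cache hit percentages are read manually from the performance monitor before and after a representative workload run; the numeric examples are hypothetical.

# Sketch of the tuning rule described above: keep increasing the cache
# read-ahead multiplier only while the cache hit percentage improves,
# otherwise treat the workload as random and disable read-ahead.
def next_multiplier(current: int, hit_pct_before: float, hit_pct_after: float) -> int:
    if hit_pct_after > hit_pct_before:
        return min(current * 2, 8)   # sequential pattern: try a more aggressive value
    return 0                         # no improvement: random I/O, optimal multiplier is zero

print(next_multiplier(4, 55.0, 71.0))  # 8 - hit rate improved, go more aggressive
print(next_multiplier(4, 55.0, 54.0))  # 0 - no improvement, disable read-ahead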

Current KB/sec and maximum KB/sec

The transfer rates of the controller are determined by the application I/O size and the I/O rate. Generally, small application I/O requests result in a lower transfer rate, but provide a faster I/O rate and shorter response time. With larger application I/O requests, higher throughput rates are possible. Understanding your typical application I/O patterns can help you determine the maximum I/O transfer rates for a given storage subsystem.

Consider a storage subsystem, equipped with Fibre Channel controllers, that supports a maximum transfer rate of 100 MBps (100,000 KB per second). Your storage subsystem typically achieves an average transfer rate of 20,000 KB/sec. (The typical I/O size for your applications is 4 KB, with 5,000 I/Os transferred per second for an average rate of 20,000 KB/sec.) In this case, the I/O size is small. Because there is system overhead associated with each I/O, the transfer rates will not approach 100,000 KB/sec. However, if your typical I/O size is large, a transfer rate within a range of 80,000 to 90,000 KB/sec might be achieved.
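The arithmetic behind that example is simply I/O size multiplied by I/O rate; a two-line sketch makes it explicit and is easy to rerun with your own measured numbers (the values below are those of the example).

# Average transfer rate from typical I/O size and I/O rate (example values).
io_size_kb = 4          # typical application I/O size, in KB
ios_per_second = 5000   # observed I/O rate

transfer_rate_kb_s = io_size_kb * ios_per_second
print(f"{transfer_rate_kb_s} KB/sec")   # 20000 KB/sec, well below the 100,000 KB/sec interface maximum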

Current I/O per second and maximum I/O per second

Factors that affect I/Os per second include access pattern (random or sequential), I/O size, RAID level, segment size, and number of drives in the arrays or storage subsystem. The higher the cache hit rate, the higher the I/O rates.

Performance improvements caused by changing the segment size can be seen in the I/Os per second statistics for a logical drive. Experiment to determine the optimal segment size, or use the file system or database block size.

Higher write I/O rates are experienced with write caching enabled compared to disabled. In deciding whether to enable write caching for an individual logical drive, consider the current and maximum I/Os per second. You should expect to see higher rates for sequential I/O patterns than for random I/O patterns. Regardless of your I/O pattern, it is recommended that write caching be enabled to maximize I/O rate and shorten application response time.


4.1.2 Tuning cache parameters

For details on this topic, see 2.3.9, “Cache parameters” on page 40.

4.2 Controlling the performance impact of maintenance tasks

From time to time, you may need to run maintenance or performance tuning operations, or you might have a requirement to run VolumeCopy or Remote Volume Mirroring operations.

All of these operations are considered business as usual, but they do affect system performance. They run as background tasks, controlled by the storage subsystem firmware. The storage subsystem controls the sharing of resources between the background task and host system I/O activity.

To help you control the impact of background tasks, the system enables you to set the priority of the background tasks. This setting effectively adjusts the ratio of resources that are allocated between the host system I/O and the background operations.

This becomes a trade-off between having the background operation complete in the fastest possible time while potentially impacting host system performance, or having minimal impact on host system performance while taking more time to complete the background task.

These background tasks are categorized in the following sections.

4.2.1 Modification operations

A modification operation is a controller-based operation, where the controller is required to write and, in some cases, rewrite data to arrays, logical drives, and disk drives.

The modification priority defines how much processing time is allocated for logical drive modification operations relative to system performance. To learn more about setting the modification priority, refer to the IBM Redbook, IBM TotalStorage FAStT600/900 and Storage Manager 8.4, SG24-7010, available at:

http://www.ibm.com/redbooks

Modification operations include the following actions:

• Defragmenting an array:

A fragmented array can result from logical drive deletion or from not using all available free capacity in a Free Capacity node during logical drive creation.

Because new logical drives cannot spread across several free space nodes, the logical drive size is limited to the largest free space node available, even if there is more free space in the array. The array needs to be defragmented first to consolidate all free space nodes into one free space node for the array. Then, a new logical drive can use the whole available free space.

Use the defragment option to consolidate all free capacity on a selected array. The defragmentation runs concurrently with normal I/O; it impacts performance, because the data of the logical drives must be moved within the array. Depending on the array configuration, this process can run for a long period of time. After the procedure is started, it cannot be stopped. During this time, no configuration changes can be performed on the array.


The defragmentation done on the FAStT Storage Server only applies to the free space nodes on the array. It is not connected to a defragmentation of the file system used by the host operating systems in any way.

• Copyback:

Copyback refers to the process of copying data from a hot-spare drive (used as a standby in case of a possible drive failure) to a replacement drive. When you physically replace the failed drive, a copyback operation automatically occurs from the hot-spare drive to the replacement drive.

• Initialization:

This is the deletion of all data on a drive, logical drive, or array. In previous versions of the storage management software, this was called format.

• Dynamic Segment Sizing (DSS):

Dynamic Segment Sizing (DSS) describes a modification operation where the segment size for a selected logical drive is changed to increase or decrease the number of data blocks that the segment size contains. A segment is the amount of data that the controller writes on a single drive in a logical drive before writing data on the next drive.

• Dynamic Reconstruction Rate (DRR):

Dynamic Reconstruction Rate (DRR) is a modification operation where data and parity within an array are used to regenerate the data to a replacement drive or a hot spare drive. Only data on a RAID-1, -3, or -5 logical drive can be reconstructed.

• Dynamic RAID Level Migration (DRM):

Dynamic RAID Level Migration (DRM) describes a modification operation used to change the RAID level on a selected array. The RAID level selected determines the level of performance and parity of an array.

• Dynamic Capacity Expansion (DCE):

Dynamic Capacity Expansion (DCE) describes a modification operation used to increase the available free capacity on an array. The increase in capacity is achieved by selecting unassigned drives to be added to the array. After the capacity expansion is completed, additional free capacity is available on the array for the creation of other logical drives. The additional free capacity can then be used to perform a Dynamic Logical Drive Expansion (DVE) on a standard or FlashCopy repository logical drive.

• Dynamic Logical Drive Expansion (DVE):

Dynamic Logical Drive Expansion (DVE) is a modification operation used to increase the capacity of a standard logical drive or a FlashCopy repository logical drive. The increase in capacity is achieved by using the free capacity available on the array of the standard or FlashCopy repository logical drive.

Dynamic Logical Drive Expansion (DVE) is a modification operation used to increase the capacity of a standard logical drive or a FlashCopy repository logical drive. The increase in capacity is achieved by using the free capacity available on the array of the standard or FlashCopy repository logical drive.

The modification priority rates are lowest, low, medium, high, and highest.

Note: The lowest priority rate favors system performance, but the modification operation takes longer. The highest priority rate favors the modification operation, but system performance can be compromised.

4.2.2 Remote Volume Mirroring operations

When a storage subsystem logical drive is a primary logical drive and a full synchronization is necessary, the controller owner performs the full synchronization in the background while processing local I/O writes to the primary logical drive and associated remote writes to the secondary logical drive. Because the full synchronization diverts controller processing resources from I/O activity, it can impact performance on the host application. The synchronization priority defines how much processing time is allocated for synchronization activities relative to system performance.

The synchronization priority rates are lowest, low, medium, high, and highest.

The following guidelines roughly approximate the differences between the five priorities. Logical drive size and host I/O rate loads affect the synchronization time comparisons:

� A full synchronization at the lowest synchronization priority rate takes approximately eight times as long as a full synchronization at the highest synchronization priority rate.

� A full synchronization at the low synchronization priority rate takes approximately six times as long as a full synchronization at the highest synchronization priority rate.

� A full synchronization at the medium synchronization priority rate takes approximately three and a half times as long as a full synchronization at the highest synchronization priority rate.

� A full synchronization at the high synchronization priority rate takes approximately twice as long as a full synchronization at the highest synchronization priority rate.

The synchronization progress bar at the bottom of the Mirroring tab of the Logical Drive Properties dialog box displays the progress of a full synchronization.

Note: The lowest priority rate favors system performance, but the full synchronization takes longer. The highest priority rate favors full synchronization, but system performance can be compromised.
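Using the rough multipliers quoted in the guidelines above, you can estimate how long a full synchronization would take at each priority once you have timed (or estimated) it at the highest priority. The 2-hour baseline in the sketch is purely hypothetical.

# Rough full-synchronization time estimates relative to the highest priority,
# using the approximate multipliers from the guidelines above.
multipliers = {"highest": 1.0, "high": 2.0, "medium": 3.5, "low": 6.0, "lowest": 8.0}

def estimated_sync_time(hours_at_highest: float, priority: str) -> float:
    return hours_at_highest * multipliers[priority]

for priority in ("highest", "high", "medium", "low", "lowest"):
    print(priority, estimated_sync_time(2.0, priority), "hours")  # assuming 2 hours at highest priority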

4.2.3 VolumeCopy priority rates

Several factors contribute to system performance, including I/O activity, logical drive RAID level, logical drive configuration (number of drives in the array or cache parameters), and logical drive type (FlashCopy logical drives might take more time to copy than standard logical drives).

You can select the copy priority when you are creating a new logical drive copy, or you can change it later using the Copy Manager. The copy priority rates are lowest, low, medium, high, and highest.

Note: The lowest priority rate supports I/O activity, but the logical drive copy takes longer. The highest priority rate supports the logical drive copy, but I/O activity can be affected.

4.2.4 FlashCopy operations

If you no longer need a FlashCopy logical drive, you might want to disable it. As long as a FlashCopy logical drive is enabled, your storage subsystem performance is impacted by the copy-on-write activity to the associated FlashCopy repository logical drive. When you disable a FlashCopy logical drive, the copy-on-write activity stops.

If you disable the FlashCopy logical drive instead of deleting it, you can retain it and its associated repository. Then, when you need to create a different FlashCopy of the same base logical drive, you can use the re-create option to reuse a disabled FlashCopy. This takes less time.


4.3 Event monitoring and alerts

Included in the FAStT Client package is the Event Monitor service. It enables the host running this monitor to send out alerts by e-mail (SMTP) or traps (SNMP). The Event Monitor can be used to alert you of problems in any of the FAStT Storage Servers in your environment.

Depending on the setup you choose, different storage subsystems are monitored by the Event Monitor. If you right-click your local system in the Enterprise Management window (at the top of the tree) and select Alert Destinations, this applies to all storage subsystems listed in the Enterprise Management window. Also, if you see the same storage subsystem through different paths, directly attached and through different hosts running the host agent, you receive multiple alerts. If you right-click a specific storage subsystem, you only define the alerting for this particular FAStT Storage Server.

An icon in the lower-left corner of the Enterprise Management window indicates that the Event Monitor is running on this host.

If you want to send e-mail alerts, you have to define an SMTP server first. Click Edit → Configure Mail Server. Enter the IP address or the name of your mail server and the sender address.

In the Alert Destination dialog box, you define the e-mail addresses to which alerts are sent. If you do not define an address, no SMTP alerts are sent. You also can validate the e-mail addresses to ensure a correct delivery and test your setup.
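Independently of Storage Manager, you can confirm that the configured mail server actually relays messages from the management station with a few lines of Python. This is only a connectivity check, not part of the FAStT software; the server name and addresses below are hypothetical placeholders for your own values.

import smtplib
from email.message import EmailMessage

SMTP_SERVER = "mailserver.example.com"    # hypothetical: the mail server configured in Storage Manager
SENDER = "fastt-alerts@example.com"       # hypothetical sender address
RECIPIENT = "storage-admin@example.com"   # hypothetical alert destination

msg = EmailMessage()
msg["Subject"] = "FAStT alert delivery test"
msg["From"] = SENDER
msg["To"] = RECIPIENT
msg.set_content("Test message: verifying that the SMTP relay accepts mail from this host.")

with smtplib.SMTP(SMTP_SERVER) as server:
    server.send_message(msg)
print("Test message accepted by", SMTP_SERVER)

If this test fails, the Event Monitor alerts sent through the same server are unlikely to be delivered either.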

If you choose the SNMP tab, you can define the settings for SNMP alerts: the IP address of your SNMP console and the community name. As with the e-mail addresses, you can define several trap destinations.

You need an SNMP console for receiving and handling the traps sent by the service. There is an MIB file included in the Storage Manager software, which should be compiled into the SNMP console to allow proper display of the traps. Refer to the documentation of the SNMP console you are using to learn how to compile a new MIB.

Tip: The Event Monitor service should be installed and configured on at least two systems that are attached to the storage subsystem and allow in-band management, running 24 hours a day. This practice ensures proper alerting, even if one server is down.

4.3.1 FAStT Service Alert

FAStT Service Alert is a feature of the IBM TotalStorage FAStT Storage Manager that monitors system health and automatically notifies the IBM Support Center when problems occur. Service Alert sends an e-mail to a call management center that identifies your system and captures any error information that can identify the problem. The IBM support center analyzes the contents of the e-mail alert and contacts you with the appropriate service action.

Service offering contract

To obtain a service offering contract:

1. The account team submits a request for price quotation (RPQ) requesting Service Alert, using the designated country process.

2. The IBM TotalStorage hub receives the request and ensures that the prerequisites are met, such as these:

– The machine type, model, and serial number are provided.


– The FAStT Storage Server management station is running Storage Manager Client Version 8.3 or higher.

– The FAStT Storage Server firmware level is appropriate.

– The FAStT Storage Server management station has Internet access and e-mail capability.

– Willingness to sign the contract with the annual fee is indicated.

3. After the prerequisites are confirmed, the service offering contract is sent.

4. When the contract has been signed, the approval is sent from the IBM TotalStorage hub, with the support team copied.

5. Billing is sent at the start of the contract.

Activating FAStT Service Alert

To activate Service Alert, complete the following tasks:

1. Create a user profile (userdata.txt).
2. Rename each storage subsystem and synchronize the controller clock.
3. Configure the e-mail server.
4. Configure the alert destination.
5. Validate the installation.
6. Test the system.

Creating a user profile

The user profile (userdata.txt) is a text file that contains your individual contact information. It is placed at the top of the e-mail that Service Alert generates. A template is provided, which you can download and edit using any text editor.

Perform the following steps to create the user profile:

1. Download the userdata.txt template file from one of the following Web sites:

http://www.ibm.com/storage/fastX00

Where X00 represents the appropriate FAStT model (200, 500, 700, or 900).

The userdata.txt template is named userdata.txt.

2. Enter the required information. There should be seven lines of information in the file. The first line should always be “Title: IBM FAStT Product”. The other lines contain the company name, company address, contact name, contact phone number, alternate phone number, and machine location information. Do not split the information for a given item; for example, do not put the company address on multiple lines. Use only one line for each item. (A small format-check sketch follows this procedure.)

The Title field of the userdata.txt file must always be “IBM FAStT Product”. The rest of the fields should be completed for your specific FAStT Storage Server installation.

Important: The user profile file name must be userdata.txt. The file content must be in the format as described in step 2. In addition, the file must be placed in the appropriate directory in the FAStT Storage Server management station as indicated in step 4.

Note: When you type in the text for the userdata.txt file, the colon (:) is the only legal separator between the required label and the data. No extraneous data is allowed (blanks, commas, and so on) in the label unless specified. Labels are not case sensitive.


See Example 4-1 for an example of a completed userdata.txt user profile.

Example 4-1 Sample userdata.txt

Title: IBM FAStT Product
Company name: IBM (73HA Department)
Address: 3039 Cornwallis Road, RTP, NC 27709
Contact name: John Doe
Contact phone number: 919-254-0000
Alternate phone number: 919-254-0001
Machine location: Building 205 Lab, 1300

3. Save the userdata.txt file in ASCII format.

4. Store the userdata.txt file in the appropriate subdirectory of the FAStT Storage Server management station, depending on the operating system that is installed in the management station:

– For Microsoft Windows 2000 and Windows NT4, store the userdata.txt file in the %SystemRoot%\java\ directory if Event Monitor is installed, or if Event Monitor is not installed, in the Installed_Windows_driveletter:\Documents and Settings\Current_login_user_folder directory.

If your Windows 2000 or Windows NT4 installation uses the default installation settings, and the current login user ID is Administrator, the directories are c:\WINNT\java or c:\Documents and Settings\Administrator, respectively.

– For AIX, store the userdata.txt file in the / directory.

– For Red Hat Advanced Server, store the userdata.txt file in the default login directory of the root user. In a normal installation, this directory is /root.

– For SuSE 8, store the userdata.txt file in the default login directory of the root user. In a normal installation, this directory is /root.

– For Novell NetWare, store the userdata.txt file in the sys:/ directory.

– For Solaris, store the userdata.txt file in the / directory.

– For HP-UX, store the userdata.txt file in the / directory.

– VMware ESX servers that are connected to a FAStT Storage Server require a separate workstation for FAStT Storage Server management. Service Alert is only supported in a VMware ESX and FAStT environment by way of the remote management station.
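The format-check sketch mentioned in step 2 is shown below. It only applies the rules stated above (seven lines, a colon as the only separator, case-insensitive labels, and the fixed Title line); the label order is assumed to follow Example 4-1.

# Check a userdata.txt file against the format rules described above.
EXPECTED_LABELS = [            # assumed label order, following Example 4-1
    "title", "company name", "address", "contact name",
    "contact phone number", "alternate phone number", "machine location",
]

def check_userdata(path: str = "userdata.txt") -> list:
    problems = []
    with open(path, encoding="ascii") as f:
        lines = [line.rstrip("\n") for line in f]
    if len(lines) != 7:
        problems.append("expected 7 lines, found %d" % len(lines))
    for expected, line in zip(EXPECTED_LABELS, lines):
        label, sep, _data = line.partition(":")
        if sep != ":":
            problems.append("missing ':' separator in line: %r" % line)
        elif label.strip().lower() != expected:   # labels are not case sensitive
            problems.append("unexpected label %r" % label.strip())
    if lines and lines[0].partition(":")[2].strip() != "IBM FAStT Product":
        problems.append("first line must be 'Title: IBM FAStT Product'")
    return problems

print(check_userdata() or "userdata.txt looks OK")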

Renaming the FAStT subsystem and synchronizing the controller clock

When you register for Service Alert, you must change the existing node ID of each FAStT Storage Server. Service Alert uses this new name to identify which FAStT Storage Server has generated the problem e-mail. To rename the Storage Server, refer to 3.1.2, “Installing and starting the FAStT Storage Manager Client” on page 59. Before you rename the storage subsystem, record the FAStT Storage Server machine type, model, and serial number.

To rename the FAStT subsystem and synchronize the controller clock:

1. Enter the new name for the subsystem. You must use the following naming convention for the new name. Any errors in the format of the new name can result in delays or denial of IBM service support. The new name cannot contain more than 30 characters. The format for the new name is:

ttttmmm/sssssss#cust_nodeid_reference

Where:

– tttt is the 4-digit IBM machine type of the product.
– mmm is the 3-digit IBM model number for the product.
– / is the required separator.
– sssssss is the 7-digit IBM serial number for the machine.
– # is the required separator.
– cust_nodeid_reference is the node ID as referenced by the customer.

Important: No extra characters are allowed before the "#" separator.

Note: You must have a Storage Manager Client session running to monitor failures of the FAStT Storage Server.

A small format-check sketch for the new name appears at the end of this procedure.

Use the information provided in Table 4-1 as a reference list of FAStT machine types and model numbers.

Table 4-1 FAStT Storage Server machine and model numbers

Product    Machine type    Model number          Model number in your name
FAStT900   1742            90U, 90X              900
FAStT700   1742            1RU, 1RX              000
FAStT500   3552            1RU, 1RX              000
FAStT200   3542            1RU, 1RX, 2RU, 2RX    000

Following are some examples of storage subsystem names:

1742900/23A1234#IBM_Eng
1742000/23A1235#IBM_Acctg
3552000/23A1236#IBM_Mktg
3542000/23A1237#IBM_Mfg

2. Click OK to save the new name.

3. To synchronize the controller clock with the time in the FAStT Storage Server management station that monitors the alerts, refer to 3.1.2, “Installing and starting the FAStT Storage Manager Client” on page 59.

This step is optional. If performed, it facilitates the troubleshooting session, because the time that the alert e-mail is sent is about the same as the time that the errors occurred in the FAStT Storage Server.

The steps in “Creating a user profile” on page 84 and “Renaming the FAStT subsystem and synchronizing the controller clock” on page 85 must be performed for each of the FAStT Storage Servers that support Service Alert.
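Here is the format-check sketch referred to in step 1. It is a minimal, unofficial helper: it only tests the pattern and length rules stated above, it assumes the 7-character serial number may contain letters (as in the examples), and it does not know your real machine type or serial number.

import re

# Format from step 1: ttttmmm/sssssss#cust_nodeid_reference, at most 30 characters,
# with no extra characters before the "#" separator.
NAME_PATTERN = re.compile(r"^\d{4}\d{3}/[A-Za-z0-9]{7}#.+$")

def check_subsystem_name(name: str) -> bool:
    if len(name) > 30:
        return False
    return bool(NAME_PATTERN.match(name))

print(check_subsystem_name("1742900/23A1234#IBM_Eng"))   # True  (example from the text)
print(check_subsystem_name("1742 900/23A1234#IBM_Eng"))  # False (extra character before the separator)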

Configuring the e-mail server

You must configure your e-mail server to enable it to send alerts. Refer to 4.3, “Event monitoring and alerts” on page 83 for instructions about how to do this.

The e-mail address you enter is used to send all alerts.

Configuring the alert destination

Refer to 4.3, “Event monitoring and alerts” on page 83 for instructions about how to do this.


In the E-mail address text box, enter either one of the following e-mail addresses, depending on your geographic location:

• For EMEA and A/P locations: [email protected]
• For North America locations: [email protected]
• For South and Central America, and Caribbean Island locations: [email protected]

Validating the Service Alert installation

Make sure that the FAStT Event Monitor service is installed in the management station. If it is not installed, you must uninstall the FAStT Storage Manager Client and reinstall it with the Event Monitor service enabled.

Note: The FAStT Event Monitor service is not supported on Novell Netware 6. You must use a management station with other operating systems installed, such as Windows 2000.

Testing the Service Alert installation

After all previous tasks are completed, you are ready to test your system for Service Alert.

Call your IBM Support Center. Tell the representative that you are ready to test the Service Alert process. The IBM representative will work with you to test your system setup and ensure that FAStT Service Alert is working properly.

A test that you will perform, with the help of the Support Center, is to manually fail a non-configured drive in the FAStT Storage Server using the FAStT Storage Manager Client. If all of the drives are configured, you can turn off a redundant power supply in the FAStT Storage Server or FAStT expansion enclosure. When the drive fails or the power supply is turned off, a Service Alert is sent to the IBM e-mail address that you specified in “Configuring the e-mail server” on page 86.

Note: Do not turn off the power supply if this is the only one that is powered on in your storage server or expansion enclosure. Turning off the power supply is the preferred test, because it allows the testing of the FAStT Storage Server Event Monitor service. This service monitors the FAStT Storage Server for alerts without needing to have the FAStT Storage Manager Client running in the root user session.

4.4 Saving the subsystem profile

Configuring a FAStT Storage Server is a complex task, so the subsystem profile provides a single location where all of the configuration information is stored. The profile includes information about the controllers, attached drives and enclosures, their microcode levels, arrays, logical drives, and storage partitioning.

To obtain the profile, open the Subsystem Management window and click View → Storage Subsystem Profile.


Tip: You should save a new profile each time you change the configuration of the FAStT storage subsystem, no matter how minor the change. The profile should be stored in a location where it is available even after a complete configuration loss, for example, after a site loss.


4.5 Upgrades and maintenance

Every so often, IBM releases new firmware (posted on the support Web site) that will need to be installed. Occasionally, IBM may remove old firmware versions from support. Upgrades from unsupported levels are mandatory to receive warranty support.

This section reviews the required steps to upgrade your IBM TotalStorage FAStT Storage Server when firmware updates or fixes, or both, become available. Upgrades to the FAStT Storage Server firmware should be accompanied by an upgrade to the latest available version of the Storage Manager client software. It is possible to manage a FAStT Storage Server running down-level firmware with the latest SMclient, but not possible to manage a Storage Server running the latest version of firmware with a down-level client.

Note: The version number of the Storage Manager firmware and the Storage Manager client are not completely connected. For instance, even if you are running Storage Manager 8.3 firmware, you should run the latest version of the Storage Manager client on your host. (At the time of writing, this was 8.4.)

4.5.1 Being up-to-date with your drivers and firmware using My support

My support registration provides e-mail notification when new firmware levels are available for download and installation. To register for My support:

1. Visit the Web at:

http://www.ibm.com/support/mySupport

2. Under Personalized support, click on My support.

3. Under We use IBM Registration, click on Register (if you are not registered yet).

4. Fill in the information required for registration. Items with an asterisk (*) are required fields.

5. Go back to Login. Enter your User ID and Password and click Go to access My support.

6. Enter the information required for your My support profile.

a. Under Select a product family:, select Computer Storage and click on Go.

b. Under Disk Storage, select TotalStorage FAStT Storage Server, or any other desired products.

c. Click Save & return at the bottom of the page.

d. Review your profile for correctness.

e. Under Select mail preferences, check the boxes labeled Flashes and Downloadable files, and click Submit.

f. Under Welcome, (your name), click on Sign out to end your session.

You will be notified whenever there is new firmware available for the products you selected during registration.

We also suggest you explore and customize to your needs the other options available under My support.

4.5.2 Prerequisites for upgrades

Upgrading the firmware and management software for the FAStT Storage Server is a relatively simple procedure. Before you start, you should make sure that you have an adequate maintenance window to do the procedure, because on large configurations it can be a little time consuming. The times for upgrading all the associated firmware and software are in Table 4-2. These times are only approximate and can vary from system to system.

Table 4-2 Upgrade times

Element being upgraded                                          Approximate time of upgrade
Storage Manager software and associated drivers and software   35 minutes
FAStT Storage Server firmware                                   5 minutes
FAStT ESM firmware                                              5 minutes per ESM
Hard drives                                                     3 minutes per drive

It is critical that if you update one part of the firmware, you update all the firmware and software to the same level. You must not run a mismatched set.

All the necessary files for performing this upgrade are available at:

http://www.ibm.com/pc/support/site.wss/document.do?lndocid=MIGR-4JTS2T

Look for Fibre Channel Solutions.

4.5.3 Updating FAStT host software

This section describes how to update the FAStT software in Windows and Linux environments.

Updating in a Windows environment

To update the host software in a Windows environment:

1. Uninstall the storage management components in the following order:

a. SMagent
b. SMutil
c. RDAC
d. SMclient

2. Verify that the IBM host adapter device driver versions are current. If they are not current, refer to the readme file located with the device driver and then upgrade the device drivers.

3. Install the storage manager components in the following order:

a. RDAC
b. SMagent
c. SMutil
d. SMclient

Updating in a Linux environment

To update the host software in a Linux environment:

1. Uninstall the storage manager components in the following order:

a. FAStT Runtime environment
b. SMutil
c. RDAC
d. SMclient

2. Verify that the IBM host adapter device driver versions are current. If they are not current, refer to the readme file located with the device driver and then upgrade the device drivers.


3. Install the storage manager components in the following order:

a. RDAC
b. FAStT Runtime environment
c. SMutil
d. SMclient

4.5.4 Updating microcode

Updating the firmware of your FAStT Storage Server will be required from time to time to keep up with the latest fixes and enhancements. For full instructions about how this is done, refer to the documentation that is supplied with the firmware.

You should update the microcode in the following order:

1. Controller firmware/NVSRAM
2. ESM firmware
3. Hard drive firmware

4.6 Capacity upgrades, system upgrades

The FAStT has the ability to accept new disks and/or EXP units dynamically, with no downtime to the FAStT unit. In fact, the FAStT must be powered on when adding new hardware.

Important: Prior to physically installing new hardware, refer to the instructions in the Fibre Channel Hard Drive and Storage Expansion Enclosure Installation and Migration Guide, GC26-7639, available at:

http://www.ibm.com/support/docview.wss?uid=psg1MIGR-55466

Failure to consult this documentation may result in data loss, corruption, or loss of availability to your storage.

4.6.1 Capacity upgrades and increased bandwidth

We briefly describe here the basic process to follow when adding expansion enclosures, either to increase capacity or bandwidth.

After physical installation, use Storage Manager to create new arrays/LUNs, or extend existing arrays/LUNs. (Note: Some operating systems may not support dynamic LUN expansion.)

Adding capacity

We assume that the equipment has been mounted and that each tray has been given a unique identifier. The drive bays are empty, and drives will be added after the loop is completed.

To add an expansion enclosure:

1. First, add a new expansion (EXP3) on drive loop A by connecting a new cable (numbered 1 in Figure 4-1).

2. Move the cable on loop B from EXP2 to EXP3, as indicated by the cable labeled 2 in the diagram.


3. Add a cable from EXP2 to EXP3 (as indicated by the cable labeled 3) to complete loop B.

Figure 4-1 Capacity scaling

Increasing bandwidth

You can increase bandwidth by moving expansion enclosures to a new or unused mini hub pair (this doubles the drive-side bandwidth).

This reconfiguration can also be accomplished with no disruption to data availability or interruption of I/O.

Let’s assume that the initial configuration is the one depicted on the left in Figure 4-2. We are going to move EXP2 to the unused mini hub pair on the FAStT900.

Figure 4-2 Increasing bandwidth


To move EXP2 to the unused mini hub pair, proceed as follows:

1. Remove the drive loop B cable between the second mini hub and EXP2 (cable labeled a). Move the cable from EXP2 going to EXP1 (cable labeled b, from loop B) and connect it from EXP1 to the second mini hub (cable labeled 1).

2. Connect a cable from the fourth mini hub to EXP2, establishing drive loop D (represented by cable labeled 2).

3. Remove the drive loop A cable between EXP1 and EXP2 (cable labeled c) and connect a cable from the third mini hub to EXP2, establishing drive loop C (represented by the cable, labeled 3).

Other considerations when adding expansion enclosures

Figure 4-3 is a diagram of an EXP700 ESM board, showing how the drives are arranged in a Fibre Channel Arbitrated Loop (FC-AL).

A Fibre Channel loop supports 127 addresses. This means the FAStT900 can support up to 8 EXP700 expansion enclosures, or 11 EXP500 expansion enclosures per drive loop, for a total of 112 or 110 drives being addressed.

Figure 4-3 EXP700 ESM board diagram

Because two fully redundant loops can be set, you can connect up to 16 EXP700 expansion enclosures or 22 EXP500 expansion enclosures, for a total of up to 224 disk drives (if using the EXP700) or 220 disk drives (if using the EXP500) without a single point of failure.
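The drive counts quoted above follow directly from the number of drive bays per enclosure; the quick check below assumes 14 bays per EXP700 and 10 per EXP500.

# Check the per-loop and two-loop drive counts quoted above.
exp700 = {"enclosures": 8, "bays": 14}
exp500 = {"enclosures": 11, "bays": 10}

for name, e in (("EXP700", exp700), ("EXP500", exp500)):
    per_loop = e["enclosures"] * e["bays"]
    print(f"{name}: {per_loop} drives per redundant drive loop, {2 * per_loop} with two loop pairs")
# EXP700: 112 and 224; EXP500: 110 and 220, matching the totals in the text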

Expansion enclosure ID (Tray ID)

It is very important to correctly set the tray ID switches on the EXPs. They are used to differentiate multiple EXP enclosures that are connected to the same FAStT Storage Server. Each EXP must use a unique value. The FAStT Storage Manager uses the tray IDs to identify each EXP700 enclosure.

Additionally, the Fibre Channel loop ID for each disk drive is automatically set according to:

• The EXP bay where the disk drive is inserted
• The EXP ID setting


It is important to avoid Hard ID contention (two disks having the same ID on the loop). Such contention can occur when the units digits of two expansion enclosure IDs on the same drive-side loop are identical; for example, enclosures with IDs 0 and 10, 4 and 14, or 23 and 73 could all have Hard ID contention between devices. See also 3.2.3, “Expansion unit numbering” on page 70.
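A quick way to screen a planned set of tray IDs for this problem is to compare their units digits; the sketch below does just that and nothing more (the ID lists are hypothetical).

from itertools import combinations

# Flag enclosure (tray) ID pairs on the same drive-side loop whose units
# digits collide, which can lead to the Hard ID contention described above.
def id_conflicts(tray_ids):
    return [(a, b) for a, b in combinations(tray_ids, 2) if a % 10 == b % 10]

print(id_conflicts([0, 10, 4, 23]))   # [(0, 10)] - IDs 0 and 10 share units digit 0
print(id_conflicts([1, 2, 3, 4]))     # []        - no contention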

Do not intermix EXP500 and EXP700

It is recommended that you use an EXP700 with the FAStT900. If any EXP500 is connected into the loop, you should manually set the 1 Gbps speed switch to force all the devices and hosts connected to this FAStT900 to work at 1 Gbps speed. With some EXP700 units, the switch is behind a cover plate screwed to the back of the unit. (The plate prevents the switch from being moved inadvertently.) Remember also that changing the speed of the drive loop requires a shutdown of the FAStT Storage Server.

It is not considered good practice to mix EXP700s with EXP500s; this should be avoided whenever possible.

Where intermixing is unavoidable, follow these guidelines:

• Limit the number of expansion enclosures to 8.

• When setting the EXP ID of an EXP500, use units digits 0 to 7 only (this is because drives in EXP500 enclosures with IDs 8 and 9 have the same preferred hard ID as the drives in slots 11 through 14 of an EXP700).

• Limit the number of drives installed in the FAStT to 80% of maximum capacity. In an intermix environment, with a maximum number allowed of 224, 180 drives would be the maximum number recommended.

4.6.2 System upgrade and disk migration procedures

The procedures to migrate disks and arrays or upgrade to a newer FAStT controller are not particularly difficult, but care must be taken to ensure that data is not lost. The checklist for ensuring data integrity and the complete procedure for performing capacity upgrades or disk migration is beyond the scope of this book. Users MUST consult the Fibre Channel Hard Drive and Storage Expansion Enclosure Installation and Migration Guide, GC26-7639, available at:

http://www-1.ibm.com/support/docview.wss?uid=psg1MIGR-55466

Here, we explain the FAStT feature that makes it easy for upgrading subsystems and moving disk arrays. This feature is known as DACstore.

What is DACstore?

DACstore is an area on each drive in a FAStT storage array where configuration information is stored. This 512 MB reservation (as pictured in Figure 4-4) is invisible to a user and contains information about the FAStT configuration.


Figure 4-4 The DACstore area of a FAStT disk drive

The standard DACstore on every drive stores:

• Drive state and status
• WWN of the FAStT controller blade behind which the disk resides
• Logical drives contained on the disk

Some drives also store extra global controller and subsystem level information; these are called sundry drives. The FAStT controllers will assign one drive in each array as a sundry drive, although there will always be a minimum of three sundry drives even if only one or two arrays exist.

Additional information stored in the DACstore region of the sundry drive:

• Failed drive information
• Global Hot Spare state/status
• Storage subsystem identifier (SAI or SA Identifier)
• SAFE Premium Feature Identifier (SAFE ID)
• Storage subsystem password
• Media scan rate
• Cache configuration of the storage subsystem
• Storage user label
• MEL logs
• LUN mappings, host types, and so on
• Copy of the controller NVSRAM

Why DACstore?

This unique feature of FAStT storage servers offers a number of benefits:

• Storage system level reconfiguration. Drives can be rearranged within a storage system to maximize performance and availability through channel optimization.

• Low risk maintenance. If drives or disk expansion units are relocated, there is no risk of data being lost. Even if a whole FAStT controller needed to be replaced, all of the data and the controller configuration could be imported from the disks.

• Data intact upgrades and migrations. All controllers recognize configuration and data from other controllers, so migrations can be for the entire disk subsystem, as shown in Figure 4-5, or for array-group physical relocation, as illustrated in Figure 4-6.


Figure 4-5 Upgrading FAStT controllers

Figure 4-6 Migrating RAID arrays

4.6.3 Other considerations when adding expansion enclosures and drives

These are some recommendations to keep in mind:

• If new drawers have been added to the FAStT, it is recommended that some downtime eventually be scheduled to redistribute the physical drives in the arrays.

• Utilizing the DACstore function available with the FAStT, drives can be moved from one slot (or drawer, or loop) to another with no effect on the data contained on the drive. This must, however, be done while the array is offline.

• When adding drives to an expansion unit, do not add more than two drives at a time.

• For maximum resiliency in the case of failure, arrays should be spread out among as many EXP units as possible. If you merely create a 14-drive array in a new drawer every time you add an EXP700 full of disks, all of the traffic for that array will be going to that one tray. This can affect performance and redundancy (see also “Channel protection planning” on page 34).

• For the best balance of LUNs and I/O traffic, drives should be added into expansion units in pairs. In other words, every EXP should contain an even number of drives, not an odd number such as 5.


• If you are utilizing two drive loop pairs, approximately half of the drives in a given array should be on each loop pair. In addition, for performance reasons, half of the drives in an array should be in even-numbered slots, and half in odd-numbered slots within the EXP units. (The slot number affects the default loop for traffic to a drive.)

• To balance load among the two power supplies in an EXP, there should also be a roughly equal number of drives on the left and right hand halves of any given EXP. In other words, when adding pairs of drives to an EXP, add one drive to each end of the EXP.

The complete procedure for drive migration is given in the Fibre Channel Hard Drive and Storage Expansion Enclosure Installation and Migration Guide, GC26-7639.


Part 2 Advanced topics

In this part of the book, we describe several advanced topics. First, we explain a technique for migrating from other forms of disk storage, such as the 7133 SSA disk, to FAStT in an AIX environment. As a better alternative, we then present Piper, an IBM service offering for transparent, hardware based migration.

Finally, we have included information on High Availability Cluster Multiprocessing (HACMP) and the General Parallel File System (GPFS) as they relate to FAStT storage when attached to AIX hosts.


Chapter 5. Migrating 7133 to FAStT in AIX

This chapter explains a procedure to migrate, in an AIX environment, from a 7133 Serial Storage Architecture (SSA) disk storage solution to FAStT storage.

The procedure essentially uses functions and features of the AIX Logical Volume Manager. It could also thus apply to migration from other devices. Migration options include online migration (while the application is running) and migration requiring downtime.

Prior to explaining the specifics of the migration procedure, we examine performance and sizing considerations to help choose the appropriate FAStT system, the correct number of disks, and an appropriate number of Fibre Channel (FC) adapters. This also includes information for users who have HACMP and SSA Fiber Extenders installed.

The intended audience includes IT managers who plan to replace their SSA systems, and skilled AIX administrators who will actually be doing the configuration of the FAStT and the data migration. Assuming appropriate AIX and storage administration skills, this procedure is an alternative to the IBM Piper Lite service offering (see Chapter 6, “IBM migration services” on page 119).


Note: This chapter is based on a paper authored by Dan Braden, Technical Support Marketing - ATS, America.

Important: Whenever you consider system changes, whether small or large, proper change control must be exercised. At least, ensure that full backups have been taken and the restore has been tested.


5.1 Performance sizing considerations

Obviously, when migrating from 7133 SSA disks to a FAStT solution, the new solution should be designed and sized such that it will perform as well as or better than the current SSA configuration.

However, there are several ways of looking at performance. Most important, usually, is the application performance, but more meaningful from a migration perspective is disk subsystem performance. The metrics are different for these. Application performance generally refers to number of concurrent users, maximum transactions per second, batch job run time, and so on. Disk subsystem performance generally is split into random and sequential IO workloads, with the metrics being IOPS (IOs per second) for random workloads and MB/s for sequential workloads.

There are essentially three approaches one may take to size the FAStT storage subsystem to replace the SSA subsystem:

� Size a disk solution based upon the SSA configuration.
� Size a disk solution based upon the application IO rates.
� Size a disk solution that provides more performance.

Keep in mind as well that application performance is directly impacted by the time it takes to complete IOs (the IO service time). If your application requires very high performance, minimizing IO service times may be important.

Typically, for both single disks and subsystems, the IO service time increases only slightly as IOPS increase, up to a point beyond which it increases rapidly. Observations show that each physical disk in a FAStT subsystem can perform 150 IOPS with a reasonable response time.

5.1.1 SSA adapters and FAStT adapters performance comparison

For reference and comparison, we have included in this section several tables that summarize performance related characteristics for SSA adapters and FAStT adapters used in an AIX environment.

SSA adapters

SSA adapters support an optional fast write cache (FWC) of 32 MB, while the FAStT has considerably more cache, from 128 MB to 2 GB, depending upon the FAStT model.

Table 5-1 and Table 5-2 summarize the SSA adapter bandwidth.

Table 5-1 Maximum number of highly active disks per adapter

JBOD           96
RAID 5         48
RAID 1         88
RAID 10        72
LVM Mirroring  88


Table 5-2 Max IOPS for various configurations

           1-way loop,         1-way loop,      2-way loop,         2-way loop,
           70% reads, no FWC   70% reads, FWC   70% reads, no FWC   70% reads, FWC
RAID 5     2,700               2,000            4,200               3,000
RAID 1     4,800               3,300            6,100               4,000
RAID 10    5,200               3,500            6,200               4,400

The conclusion is that use of RAID, FWC, or two-way FWC reduces the adapter’s maximum IOPS bandwidth. This document assumes that you are using the FC 6225 or 6230 SSA adapters (the Advanced Serial RAID adapter). If you have one of the older adapters, perform the sizing as though you were using the 6225/6230 adapter; since these have the highest SSA adapter bandwidth, this errs on the side of a larger FAStT configuration.

More detailed information is available in the Advanced Serial RAID Adapter Planning Guide available at:

http://www.storage.ibm.com/hardsoft/products/ssa/docs/pdfs/ssaplan.pdf

FAStT FC adapter

Typically, users will attach the FAStT to pSeries via the FC 6239 (Fibre Channel) adapter. Some users will have the older FC 6227 or 6228 adapters (these have been withdrawn from marketing). Note that the 6227 is a 1 Gbps (about 100 MB/s) adapter, while the 6228 and 6239 are 2 Gbps (about 200 MB/s) FC adapters. This is the architectural bandwidth of the interconnect; however, the sustained throughput of the adapters is different, and can be measured for either sequential (MB/s) or random (IOPS) workloads. Table 5-3 shows the FC adapters' sustainable bandwidths:

Table 5-3 Adapter bandwidth

Adapter                          Feature code   Max 4 KB IOPS   Max MB/s
Gigabit Fibre Channel 4-S        6227           10,000          85 MB/s
Two Gigabit Fibre Channel 4-W    6228           36,000          175 MB/s read, 130 MB/s write
Two Gigabit Fibre Channel PCI-X  5704, 6239     45,000          190 MB/s simplex, 380 MB/s duplex

Note that if the system connects through a SAN switch that operates at 1 Gbps, the stated maximum sustained throughputs for the 2 Gbps adapters should be divided by 2.

Given the sustainable throughputs for FC adapters compared to SSA adapters, we generally need fewer FC adapters than SSA adapters to provide equivalent or better performance.

Installing two FC 6239 adapters will provide availability along with adequate bandwidth in most cases. The exceptions will be configurations that used four SSA adapters and had high bandwidth utilization, or when using existing 6227 or 6228 adapters as replacements for the SSA adapters.


5.2 Sizing a solution based upon the SSA configuration

Sizing a solution from the standpoint of the SSA configuration is the simplest case.

As we have just seen, the appropriate number of FC adapters is a minimum of two (for redundancy), with at least one 6239 adapter for every 4.5 SSA adapters. This assumes an IOPS rate of 10,000 IOPS per SSA adapter, as indicated in Table 5-3 on page 101.

SSA adapters have only an optional write cache, and then only up to 32 MB, while the FAStT controllers offer considerably more (128 MB to 2 GB); this alone often leads to better performance.

On the other hand, this approach does not take into account the actual application IO rate nor considers if growth is needed.

For the disk subsystem, we need to choose:

� Number of disk drives
� Disk spindle speed and size
� FAStT model

For the number of disks and their spindle speed, a good rule of thumb is to purchase at least half as many FAStT disks as you have SSA disks, of the same spindle speed or faster. In any case, ensure that enough storage capacity is available.
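
These two rules of thumb are simple enough to script. The following ksh sketch is an illustration only; the SSA_ADAPTERS and SSA_DISKS values are placeholders that you would replace with the counts from your own SSA configuration. It applies the minimum of two adapters, the one-6239-per-4.5-SSA-adapters ratio, and the half-as-many-disks guideline:

SSA_ADAPTERS=4          # assumed value - replace with your SSA adapter count
SSA_DISKS=48            # assumed value - replace with your SSA disk count

# One FC 6239 per 4.5 SSA adapters (4.5 = 9/2), rounded up, with a minimum of two
FC_ADAPTERS=$(( (2 * SSA_ADAPTERS + 8) / 9 ))
[ $FC_ADAPTERS -lt 2 ] && FC_ADAPTERS=2

# At least half as many FAStT disks as SSA disks, rounded up
FASTT_DISKS=$(( (SSA_DISKS + 1) / 2 ))

echo "Minimum FC 6239 adapters: $FC_ADAPTERS"
echo "Minimum FAStT disks:      $FASTT_DISKS"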

5.3 Sizing a solution based upon the application IO rates

This approach uses application IO data as the basis for sizing a solution that can provide equivalent or better disk performance.

The iostat command provides a simple way to collect application IO data over time and then find the time period with the peak IO load. The peak IO load value is used as the basis for sizing the solution.

Note that if the system is using LVM mirroring, the iostat reports reflect the mirrored writes, and in this case we need to divide the writes by 2. This distinguishes application IOs from physical IOs: iostat provides an LVM view of IO, so the physical disk IOs will differ when RAID 1/5/10 is used, and the application IOs will differ when LVM mirroring is used.

You can manually determine peak IOPS periods from iostat data. To collect the data, use an interval of 1 hour over a relevant time period (for example, one week). Run the following command:

# iostat -s 3600 120 > iostat-ts.out

Then examine the iostat data (the file iostat-ts.out in our example). Look for the peak IOPS period under the tps column (tps, transactions per second, is equivalent to IOPS). You can use the following AIX commands to sort the data and print the system tps for the top 10 intervals.

# grep -p System iostat-ts.out | grep [1234567890] | \
> awk '{print $2}' | sort -rn | head
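
If you also want to know which interval produced each of the top values, not just the values themselves, a small awk variant of the same pipeline can number the system summary lines as it reads them. This is a sketch that assumes the same iostat-ts.out file and the same iostat -s output layout as the command above:

# grep -p System iostat-ts.out | grep [1234567890] | \
> awk '{ printf "%s tps in interval %d\n", $2, ++n }' | sort -rn | head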

The output from the iostat command is shown in Figure 5-1.


Figure 5-1 Output from iostat command

To determine the actual IOPS load from a LVM point of view, you need to consider the Read/Write (R/W) ratio.

Based on the example in Figure 5-1, the R/W ratio would be calculated as follows:

% reads = 81860/(81860 + 42170) = 66%
% writes = 1 - % reads = 34%
R/W ratio = 66:34

The peak IOPS is 1240.3.

Note that different methods could be used to calculate the R/W ratio. One is to use the R/W ratio from the first iostat report (data since the last system boot). Another is to use the ratio for the peak interval, as we have done. The recommended, conservative approach is to use the ratio with the highest write percentage.

If the system is performing LVM mirroring, then you need to adjust for the mirroring. Since 34% of the IOs are writes from LVM’s point of view, then only half of these are writes from the application’s point of view. So you subtract half the writes from the IOPS and recalculate the IOPS and R/W ratio as follows:

Read IOPS = 0.66 x 1240.3 = 818.6 IOPS
Write IOPS = (1240.3 - 818.6)/2 = 210.9 IOPS
Total IOPS = 818.6 + 210.9 = 1029.5 IOPS
% reads = 818.6/1029.5 = 79.5%
% writes = 1 - % reads = 20.5%

So you would use 1029.5 IOPS, an R/W ratio of 79.5/20.5, and the RAID level to be used in the FAStT to calculate the number of disks needed; then choose a FAStT model that will exceed the IOPS rate needed.

Use the following formulas to calculate the number of physical disks needed in the FAStT:

P = peak IOPS needed (at the application level)
R = proportion of IOs that are reads (from 0 to 1)
W = proportion of IOs that are writes (from the application's point of view)

� For unprotected or RAID 0 solutions:

Disks needed = P/150 (it is assumed that a physical FAStT disk can perform 150 IOPS with good response time)

� For RAID 1 or RAID 10 solutions:

Disks needed = P (R + 2W)/150

� For RAID 5 FAStT solutions:

Disks needed = P(R + 4W)/150

For example, using the iostat data (and assuming it represents a peak interval, that we use LVM mirroring, and considering a RAID 5 solution), the FAStT would require:

1029.5 (0.795 + 4 x 0.205) / 150 = 11.1, or 12 disks

Alternatively, if the FAStT solution used RAID 10, then it would require:


1029.5 (0.795 + 2 x 0.205) / 150 = 8.3, or 9 disks

This calculation can also help you decide between RAID levels on the FAStT.
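
The mirroring adjustment and the per-RAID-level formulas can be combined into one small calculation. The following ksh/awk sketch only illustrates the arithmetic above; the input values are the ones from this example (replace them with your own figures, and set MIRRORED=0 if the source volume group is not LVM mirrored). It rounds each result up, assuming 150 IOPS per physical FAStT disk:

PEAK_TPS=1240.3     # peak tps from the iostat data
READ_PCT=0.66       # proportion of IOs that are reads (LVM view)
MIRRORED=1          # 1 if the source VG uses LVM mirroring, otherwise 0

awk -v p=$PEAK_TPS -v r=$READ_PCT -v m=$MIRRORED 'BEGIN {
    reads  = p * r
    writes = p - reads
    if (m) writes = writes / 2    # only half of the mirrored writes are application writes
    total  = reads + writes
    R = reads / total; W = writes / total
    printf "Application IOPS: %.1f (R/W ratio %.1f/%.1f)\n", total, R * 100, W * 100
    printf "RAID 0  disks needed: %d\n", int(total / 150 + 0.999)
    printf "RAID 10 disks needed: %d\n", int(total * (R + 2 * W) / 150 + 0.999)
    printf "RAID 5  disks needed: %d\n", int(total * (R + 4 * W) / 150 + 0.999)
}'

For the example values above, this prints approximately 1029.5 application IOPS, 9 disks for RAID 10, and 12 disks for RAID 5, matching the hand calculation.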

5.4 Sizing a solution for more performance

This approach is typically used when you want to grow the application workload. Generally, the approach is to assume that the increase in application workload is proportional to the IO workload. Thus, one could take either of the first two approaches, factoring in the percentage growth.

5.5 Setting up the FAStT prior to migration

Attaching a pSeries to a FAStT typically uses two FC adapters on the pSeries and two FC connections from the FAStT. A pSeries system can be directly attached, although typically attachment is via a SAN switch.

The following list summarizes the steps to perform when attaching the FAStT to an AIX host:

1. Install the FAStT hardware.
2. Set up IP addresses on FAStT controllers.
3. Install FC adapters in the pSeries system.
4. Install SAN switch (if applicable).
5. Install Storage Manager in the system managing the FAStT (that would be SMclient.aix.rte if AIX is managing the system).
6. Update FAStT NVSRAM, bootware, firmware and disk microcode.
7. Delete the access LUN.
8. Plan the FAStT RAID configuration and data layout.
9. Configure RAID arrays.
10. Configure an AIX storage partition(s) (aka host groups).
11. Configure logical disks.
12. Assign logical disks to storage partitions.
13. Install supporting software in the pSeries system.
14. Update pSeries FC adapter microcode and potentially system microcode.
15. Run cfgmgr on the AIX system.
16. Adjust the queue_depth attribute for FAStT hdisks on the AIX host.

For detailed information on planning tasks, the different RAID levels, array size, array configuration, hot spares, channel protection, and controller ownership, as well as other discussions related to segment size and caching, refer to Chapter 2, “FAStT planning tasks” on page 13.

For convenience, we include hereafter the setup tasks for the AIX host server attaching to a FAStT.

5.5.1 Install FAStT software on the AIX host server

Because AIX is only supported with out-of-band management, the host software consists of only one package, the FAStT Storage Manager client.

Note: Detailed FAStT installation information can be found in Chapter 3, “FAStT configuration tasks” on page 57. You can also refer to the IBM Redbook, IBM TotalStorage: FAStT600/900 and Storage Manager 8.4, SG24-7010, for more details.


The disk array driver (RDAC), which provides redundancy in the I/O paths, is available as a PTF as part of the AIX operating system.

Since the access logical drive is not needed for an AIX host, ensure that no mapping exists for the access logical drive to an AIX host (in other words, delete the access LUN). Before installing SMclient, make sure that the following conditions are met:

� The AIX host on which you are installing SMruntime meets the minimum hardware and software requirements described in “Installation and Support Guide for AIX, HP-UX, and Solaris” for the release of Storage Manager you are installing.

� You have prepared the correct filesets for an AIX system. You can see the list of required filesets in “Installation and Support Guide for AIX, HP-UX, and Solaris”, which is provided with the machine or can be downloaded from the Internet site along with FAStT Storage Manager code. Go to:

http://www.ibm.com/storage/techsup.htm

Look for the FAStT Storage Server link under Technical Support.

To properly install the client software under AIX, you must first install SMruntime, followed by SMclient. (SMruntime provides the Java™ runtime environment required to run the SMclient.)

Installing SMruntime

You may need to adjust the following instructions for the specifics of your installation. No restart is required during the installation process.

1. Install SMruntime by typing the following command:

# installp -a -d /complete path name/SMruntime.aix-08.30.65.00.bff SMruntime.aix.rte

2. Verify that the installation was successful by typing the following command:

# lslpp -ah SMruntime.aix.rte

The verification process should return a table that describes the software installation, including the install package file name, version number, action, and action status:

# lslpp -ah SMruntime.aix.rte

Fileset                 Level        Action    Status    Date      Time
----------------------------------------------------------------------------
Path: /usr/lib/objrepos
SMruntime.aix.rte       8.40.6500.0  COMMIT    COMPLETE  06/20/03  14:10:33
                        8.40.6500.0  APPLY     COMPLETE  06/20/03  14:10:33

Installing SMclient

You may need to adjust the following instructions for the specifics of your installation.

1. Install SMclient by typing the following command:

# installp -a -d /complete path name/SMclient.aix-08.33.G5.03.bff SMclient.aix.rte

2. Verify that the installation was successful by typing the following command:

# lslpp -ah SMclient.aix.rte

The verification process should return a table that describes the software installation, including the install package file name, version number, action, and action status:

# lslpp -ah SMclient.aix.rte
Fileset                 Level         Action    Status    Date      Time
----------------------------------------------------------------------------
Path: /usr/lib/objrepos
SMclient.aix.rte        08.40.6500.0  COMMIT    COMPLETE  06/20/03  14:14:28
                        08.40.6500.0  APPLY     COMPLETE  06/20/03  14:14:28

Performing the initial configuration on AIX hosts

Complete the installation by defining logical drives. For instructions on how to do this, refer to “Creating arrays and logical drives” on page 72. Logical drives, partitioning, and all other related tasks can also be done from the AIX Storage Manager client (see Figure 5-2).

To start the client, issue the following command:

# /usr/SMclient/SMclient

Figure 5-2 Storage Manager client 8.4 for AIX

After you set up an AIX host group, perform the following steps to verify that the host ports match the AIX host:

1. Type the following command:

# lsdev -Cc adapter | grep fcs

A list that contains all the HBAs that are in the system is displayed, as shown in the following example:

# lsdev -Cc adapter | grep fcs
fcs0 Available 20-58 FC Adapter
fcs1 Available 20-60 FC Adapter

2. Identify the fcs number of the HBA that is connected to the FAStT.

3. Type the following command:

# lscfg -vl fcs? | grep Network

In this command, fcs? is the fcs number of the HBA that is connected to the FAStT.

The network address number of the HBA is displayed, as in the following example:

# lscfg -vl fcs0 | grep Network
  Network Address.............10000000C926B08F

4. Verify that the network address number matches the host port number that displays in the host partition table of the FAStT SMclient.

5. Repeat this procedure to verify the second and any additional host ports (a loop that lists all adapter network addresses at once is sketched below).
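
Rather than running lscfg once per adapter, you can list the network address (WWPN) of every FC adapter in a single pass. The loop below is a convenience sketch built only from the lsdev and lscfg commands shown above; it assumes the FC adapters are named fcsN, as in the example output:

# for fcs in $(lsdev -Cc adapter | awk '/^fcs/ {print $1}')
> do
>   echo "$fcs: $(lscfg -vl $fcs | grep 'Network Address')"
> done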

Installing the RDAC driver

The RDAC driver must be installed on all AIX hosts that will be attached to a FAStT storage subsystem.


You need the following filesets (for AIX v5.2):

� devices.fcp.disk.array.rte - RDAC software
� devices.fcp.disk.array.diag - RDAC software
� devices.fcp.disk.rte - FC Disk Software
� devices.common.IBM.fc.rte - Common FC Software

Depending on the HBA, you need:

� devices.pci.df1000f7.com
� devices.pci.df1000f7.rte
� devices.pci.df1000f9.rte

Before installing the RDAC driver, always check the prerequisites for a list of the required fileset levels on the AIX system. Prerequisites can be found in “Installation and Support Guide for AIX, HP-UX, and Solaris” provided with the machine or with the latest Storage Manager software.

Use the lslpp command to verify that the correct driver versions are installed:

# lslpp -ah devices.fcp.disk.array.rte
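
To check all of the filesets listed above in one pass rather than one lslpp invocation at a time, a loop such as the following can be used. This is a sketch only; adjust the fileset list to match the requirements for your AIX level and HBA (for example, the df1000f7 rather than df1000f9 filesets):

# for fs in devices.fcp.disk.array.rte devices.fcp.disk.array.diag \
>   devices.fcp.disk.rte devices.common.IBM.fc.rte devices.pci.df1000f9.rte
> do
>   lslpp -l $fs || echo "$fs is NOT installed"
> done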

The RDAC driver creates the following devices that represent the FAStT storage subsystem configuration:

� dar (disk array router): This represents the entire array, including the current and the deferred paths to all LUNs (hdisks on AIX).

� dac (disk array controller devices): These devices represent a controller within the storage subsystem. There are two dacs in the storage subsystem.

� hdisk: These devices represent individual LUNs on the array.

When these devices are configured, the Object Data Manager (ODM) is updated with default parameters. In most cases and for most configurations, the default parameters are satisfactory. However, there are some parameters that can be modified for maximum performance and availability. See the Installation and Support Guide for AIX, HP-UX, and Solaris provided with the Storage Manager software.

After the FAStT storage subsystem has been set up, volumes have been assigned to the host, and the RDAC driver has been installed, you must verify that all of your FAStT device names and paths are correct and that AIX recognizes your dars, dacs, and hdisks.

You must do this before you mount file systems and install applications. Type the following command to probe for the new devices:

# cfgmgr -v

Next, use the lsdev -Cc disk command to see if the RDAC software recognizes each FAStT volume correctly:

� FAStT200 volume as a "3542 (200) Disk Array Device"
� FAStT500 volume as a "3552 (500) Disk Array Device"
� FAStT600 volume as a "1722 (600) Disk Array Device"
� FAStT700 volume as a "1742 (700) Disk Array Device"
� FAStT900 volume as a "1742-900 Disk Array Device"

This coding example illustrates the results of the command for a set of FAStT700 LUNs:

# lsdev -Cc disk
hdisk0 Available 10-60-00-4,0 16 Bit LVD SCSI Disk Drive
hdisk1 Available 20-58-01     1742 (700) Disk Array Device
hdisk2 Available 20-60-01     1742 (700) Disk Array Device


After the operating system device names are found, those names must be correlated to the preferred and failover paths of the FAStT device, and then from each path to its associated logical drive.

AIX provides the following commands to help you determine the FAStT configuration, and to get information about device names and bus numbers:

� lsdev

Displays devices and their characteristics. The lsdev command shows the state of the devices at startup time, or the last time that the cfgmgr -v command was run.

� lsattr

Displays device attributes and possible values. Attributes are only updated at startup time, or the last time that the cfgmgr -v command was run.

� fget_config

Displays controllers and hdisks that are associated with a specified FAStT (dar). The fget_config command shows the current state and volume (hdisk) ownership.

There are several ways to correlate a system’s configuration and monitor the state of FAStT storage subsystems.

Example 5-1 uses the lsdev command to show the status of the dar. This example shows dar as a machine type 1742, which is a FAStT700. It is in the Available state, which is the state at the time the device was last configured by AIX.

The example also shows the status of two dacs, which represent the FAStT storage subsystem controllers. The third column shows the location code. In this example, each dac has its own location or path, which are represented by the values 20-58-01 and 20-60-01. Each AIX system has its own set of location codes that describe the internal path of that device, including bus and host adapter locations. See the service manual for your system type to identify device locations.

Example 5-1 Status of dar

# lsdev -C | grep dar
dar0 Available 1742 (700) Disk Array Router

# lsdev -C | grep dac
dac0 Available 20-58-01 1742 (700) Disk Array Controller
dac1 Available 20-60-01 1742 (700) Disk Array Controller

Example 5-2 uses the lsdev command to show the status and location codes of two FAStT700 hdisks. Notice that the location code of hdisk1 matches the location code of dac0 in the previous example, and that the location code of hdisk2 matches that of dac1. This means that the preferred path for I/O for hdisk1 is through dac0, and the failover path would be through dac1. Conversely, the preferred path for hdisk2 would be through dac1, and its failover path through dac0.

Example 5-2 Status and location codes

# lsdev -Cc disk
hdisk0 Available 10-60-00-4,0 16 Bit LVD SCSI Disk Drive
hdisk1 Available 20-58-01     1742 (700) Disk Array Device
hdisk2 Available 20-60-01     1742 (700) Disk Array Device

The fget_config command displays the state of each controller in a FAStT array, and the current path that is being used for I/O for each hdisk.


Example 5-3 shows that both controllers (dac0 and dac1) are in the Active state. This is normal when the FAStT storage subsystem is configured correctly. Other possible states could be:

� NONE: The controller is not defined or is offline.
� RESET: The controller is in the reset state.

Example 5-3 Controller status

# fget_config -l dar0
dac0 ACTIVE dac1 ACTIVE
dac0-hdisk1
dac1-hdisk2

The lsattr command provides detailed information about a volume, including information that allows you to map the system device name to the logical volume on the FAStT storage subsystem. In Example 5-4 we run the lsattr command on the LUN named hdisk1. It provides the following information: It is a 36 GB LUN of type RAID 5, with a LUN ID of 0, and an IEEE volume name of 600A0B80000CD96D000000063EF70D0C. You can make a quick identification by locating the LUN ID on the far right side of the Mappings View tab.

Example 5-4 Volume information

# lsattr -El hdisk1
cache_method  fast_write                        Write Caching method                   True
ieee_volname  600A0B80000CD96D000000063EF70D0C  IEEE Unique volume name                False
lun_id        0x0000000000000000                Logical Unit Number                    False
prefetch_mult 1                                 Multiple of blocks to prefetch on read True
pvid          none                              Physical volume identifier             False
q_type        simple                            Queuing Type                           False
queue_depth   10                                Queue Depth                            True
raid_level    5                                 RAID Level                             False
reassign_to   120                               Reassign Timeout value                 True
reserve_lock  yes                               RESERVE device on open                 True
rw_timeout    30                                Read/Write Timeout value               True
scsi_id       0x11100                           SCSI ID                                False
size          36864                             Size in Mbytes                         False
write_cache   yes                               Write Caching enabled                  True

You can make a more precise correlation of which hdisk maps to which LUN by using the distinctive ieee_volname attribute. The value of this attribute on the AIX host is the same as the Unique Logical Drive Identifier on the FAStT storage subsystem. The Unique Logical Drive Identifier can be found in a Storage Manager window by right-clicking Logical Drive Name -> Properties. Look for the Volume ID, Capacity, and RAID level properties (see Figure 5-3).


Figure 5-3 Logical Drive Properties
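
To gather this correlation for every FAStT hdisk at once, the lun_id and ieee_volname attributes can be read from the ODM in a loop. This sketch is built only from the lsdev and lsattr commands already shown; it assumes that the FAStT LUNs are described as "Disk Array Device" in the lsdev output, as in the earlier examples:

# lsdev -Cc disk | grep "Disk Array Device" | awk '{print $1}' | while read d
> do
>   lun=$(lsattr -El $d -a lun_id -F value)
>   ieee=$(lsattr -El $d -a ieee_volname -F value)
>   echo "$d  lun_id=$lun  ieee_volname=$ieee"
> done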

Changing ODM attribute settings in AIX

You can change the ODM attributes for the RDAC driver and FAStT; here we describe the settings that can be used for the best performance and availability of hdisk devices.

This section lists the attribute settings that you should use for hdisk devices and shows how to set them using the chdev -l command. To make the attribute changes permanent in the Customized Devices object class, use the -P option. Some attributes can be changed from both the SMclient and through AIX. To make these changes permanent, the modifications must be made by changing the AIX ODM attribute.

The specific hdisk attributes affected are as follows (see also Example 5-4 on page 109):

� write_cache

Indicator that shows whether write-caching is enabled on this device (yes) or not (no).

� cache_method

If write_cache is enabled, the write-caching method of this array is set to one of the following values:

– default — Default mode; the word default is not seen if write_cache is set to yes.

– fast_write — Fast-write (battery-backed, mirrored write-cache) mode.

– fw_unavail — Fast-write mode was specified but could not be enabled; write-caching is not in use.

– fast_load — Fast-load (non-battery backed, non-mirrored write-cache) mode.

– fl_unavail — Fast-load mode was specified but could not be enabled.

� prefetch_mult

Number of blocks to be prefetched into read cache for each block read (value: 0–100).

� queue_depth

Number that specifies the depth of the queue based on system configuration. Reduce this number if the array is returning a BUSY status on a consistent basis (value: 1–64)


If the changes are made through the SMclient, they will operate properly until you either restart the host or restart cfgmgr. To make the changes permanent, you must use SMIT or the chdev -P command.

Number of logical disks and queue depth for hdisk devices

Setting the queue_depth attribute to the appropriate value is important for system performance. For large FAStT configurations with many volumes and hosts attached, this is a critical setting for high availability.

The overall queue depth can total up to 1024 for SM 8.4, or 512 for SM 8.3, and the maximum queue_depth for each FAStT hdisk (FAStT logical disk) on AIX is 64. It is thus recommended to create at least 16 logical disks with SM 8.4 and at least 8 logical disks with SM 8.3.

Use the following formula (assuming SM 8.4) to determine the maximum queue depth for your system:

1024 / (number-of-hosts * LUNs-per-host)

For example, a system with four hosts, each with 32 LUNs (the maximum number of LUNs per AIX host), would have a maximum queue depth of 8:

1024 / ( 4 * 32 ) = 8

In this case, you would set the queue_depth attribute for hdiskX as follows:

# chdev -l hdiskX -a queue_depth=8 -P
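
If many FAStT hdisks need the same setting, the chdev command can be wrapped in a loop. This is a convenience sketch only; it assumes that the FAStT LUNs appear as "Disk Array Device" in the lsdev output (as in the earlier examples) and uses the value of 8 calculated above, which you would replace with the value for your own configuration:

# for d in $(lsdev -Cc disk | grep "Disk Array Device" | awk '{print $1}')
> do
>   chdev -l $d -a queue_depth=8 -P
> done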

Logical disks are assigned to controllers, and since there are two controllers, you can balance the load across the controllers. Finally, to balance the IOs across the disks, it makes sense to make the logical disks all the same size. Keep in mind that the FAStT allows you to dynamically increase the size of the logical disks if needed; this is supported at AIX 5.2 via the chvg -g command.

As far as the amount of storage required, create logical disks with enough storage to replace all the SSA storage, after accounting for LVM mirroring. This should be done for each SSA Volume Group (VG) that is being replaced; thus, if one SSA VG has 100 GB of storage, and another has 200 GB of storage (with no LVM mirroring), you’ll need one set of disks totaling 100 GB for the first VG, and another set totaling 200 GB for the second VG.

5.6 Performing the migration

Since the migration method described here is based primarily on the features and capabilities of the AIX Logical Volume Manager, we first review the concepts and components of LVM. Readers familiar with LVM can proceed directly to 5.6.2, “Migration procedure”.

Attention: Controller cache mirroring should not be disabled while write_cache is enabled. If this condition exists, the RDAC software will automatically re-enable it the next time the system is restarted, or when cfgmgr is run.

Attention: If you do not set the queue depth to the proper level, you might experience a loss of your file system.


5.6.1 Logical Volume Manager

The Logical Volume Manager controls disk resources by mapping data between a simple and flexible logical view of storage space and the actual physical disks. The Logical Volume Manager does this by using a layer of device driver code that runs above the traditional physical device drivers. This logical view of the disk storage is provided to applications and is independent of the underlying physical disk structure. Figure 5-4 illustrates the layout of those components.

Figure 5-4 Logical Volume Manager architecture

A hierarchy of structures is used to manage the actual disk storage and there is a well defined relationship among these structures.

Each individual disk drive is called a physical volume (PV) and has a name, usually /dev/hdiskx (where x is a unique integer on the system). Every physical volume in use belongs to a volume group (VG) unless it is being used as a raw storage device.

Each physical volume consists of a number of disks (or platters) stacked one above the other. Each is divided into physical partitions (PPs) of a fixed size for that physical volume.

As noted above, this logical view of disk storage is presented to applications independently of the underlying physical disk structure (see Figure 5-5).


Figure 5-5 Relationships between LP and PP

The following relationships exist between volume groups and physical volumes:

� On a single system, one to many physical volumes can make up a volume group.

� Physical volumes cannot be shared between volume groups.

� The entire physical volume becomes part of the volume group.

� The LVM is independent of the physical volume type; thus, different types of physical volumes can make up a volume group, within the limits imposed by the partition size and partition limit.

When you install the system, one volume group (called the root volume group, rootvg) is created. New volume groups can be created using the mkvg command. Physical volumes are added to a VG using the extendvg command, or can be removed with the reducevg command.

Within each volume group, one or more logical volumes (LVs) are defined. Logical volumes are the way to group information located on one or more physical volumes. Logical volumes are an area of disk used to store data, which appears to be contiguous to the application, but can be non-contiguous on the actual physical volume. It is this definition of a logical volume that allows them to be extended, relocated, span multiple physical volumes, and have their contents replicated for greater flexibility and availability.

Each logical volume consists of one or more logical partitions (LPs). Each logical partition corresponds to at least one physical partition (PP). If the logical volume is mirrored, then additional physical partitions are allocated to store the additional copies of each logical partition.

Disk independence

One great feature of the LVM, and upon which the migration procedure is based, is its ability to support a mixture of disks in a volume group, regardless of their type or location. Thus SSA, Serial, SCSI, and RAID drives can make up one volume group and may reside across a number of adapters.


The basic design of the interaction between the LVM and the disk device driver always ensures that the LVM’s use of the physical volume will have the same behavior, regardless of the type of physical volume being used in the LVM. Thus, a physical volume, such as a SSA disk, behaves the same in the LVM as a FAStT FC disk drive, although they use different adapter and device drivers.

5.6.2 Migration procedure

You can avoid system downtime by performing an online migration. However, the ability to do an online migration depends upon:

� The current LVM setup:

There are LVM limits that may affect some volume groups and limit the choice to offline migration. The relevant limits are the number of physical volumes (hdisks) in a VG, and the maximum number of physical partitions (PPs) in a VG, as indicated in Table 5-4.

Table 5-4 LVM limits

                     Regular VG   Big VG
Maximum PVs per VG   32           128
Maximum PPs per VG   32,512       130,048

� Having slots available for both SSA and FC adapters.

If your installation does not meet the above conditions, an offline migration with tape is required.

Online migration can be accomplished by adding PVs to the VG (using the extendvg command), then using either of the following methods:

� Method 1: Mirroring the VG (using the mirrorvg command) and removing the copy from the SSA disks (with the unmirrorvg or rmlvcopy command)

� Method 2: Migrating the data from SSA PV to FAStT PVs (using the migratepv command)

If a regular VG is used and there is not enough headroom within the regular VG limits to expand the VG, change it to a big VG via the chvg command. This will work as long as there are enough free PPs in the VG. Issue the following commands in sequence to change a regular VG to a big VG:

Example 5-5 Converting a VG to a big VG

# varyoffvg myvg    (this will require unmounting filesystems first)
# chvg -B myvg
# varyonvg myvg

The first method has the advantage of simplicity, but requires that the VG use less than half the LVM limit for PPs in a VG.

The second method has the advantage that, if you cannot add all of the necessary PVs to the VG to migrate all the SSA disks at once, you can add one PV at a time, move the data from an SSA PV to a FAStT PV, remove the SSA disk from the VG, and then repeat this process until all the data is migrated.


You can determine the number of PPs in a VG with the lsvg <vgname> command, as shown in Figure 5-6.

Figure 5-6 Sample lsvg output

5.6.3 Mirroring the VG (method 1) illustration

Let's assume that the data to be migrated from SSA disks resides in a volume group, datavg, that consists of ten physical volumes named hdisk1 to hdisk10. Assume that the physical volumes use a total of 10,000 physical partitions. We also assume that the number of target FAStT hdisks matches the number of SSA hdisks.

In this example, we can easily add enough storage space to mirror the Volume Group without making it a big VG (see Table 5-4).

We can define hdisk11 through hdisk20 to represent the new FAStT disks that replace the SSA disks.

We proceed as follows:

# extendvg datavg hdisk11 hdisk12 hdisk13 hdisk14 hdisk15 \
> hdisk16 hdisk17 hdisk18 hdisk19 hdisk20
# mirrorvg datavg hdisk11 hdisk12 hdisk13 hdisk14 hdisk15 \
> hdisk16 hdisk17 hdisk18 hdisk19 hdisk20

The mirrorvg step will take quite a while, as it mirrors and synchronizes all the data.
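
If you prefer to have a script wait for the synchronization rather than checking by hand, a loop such as the following can poll lsvg until no stale physical partitions remain. This is a sketch that assumes the datavg volume group name used in this example:

# while lsvg datavg | grep -q "STALE PPs: *[1-9]"
> do
>   sleep 300     # check again every five minutes
> done
# echo "No stale partitions remain in datavg"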

Whether you wait with such a loop or check manually, run # lsvg datavg and ensure that the STALE PVs field is 0 before proceeding. Finally, we can remove the mirror copy on the SSA disks as follows:

# unmirrorvg datavg hdisk1 hdisk2 hdisk3 hdisk4 hdisk5 hdisk6 \
> hdisk7 hdisk8 hdisk9 hdisk10

Then, remove the SSA disks from the VG:

# reducevg datavg hdisk1 hdisk2 hdisk3 hdisk4 hdisk5 hdisk6 \
> hdisk7 hdisk8 hdisk9 hdisk10

Or, to reduce the typing, use an in-line script as follows:

# for i in 1 2 3 4 5 6 7 8 9 10
> do
> reducevg datavg hdisk$i
> done

Note that if the VG had more than 16,256 PPs (more than half the maximum number of PPs in a regular VG), you would make it a big VG first as illustrated in Example 5-5.


5.6.4 Migrating PVs (method 2) illustration

If the VG has more than 65,024 PPs (more than half the limit for a big VG), mirroring cannot be used. If, however, there are fewer than 117,042 PPs (130,048 - 130,048/10), we can use the migration method.

Assuming that we could only add one disk at a time to the VG, and that we use 10 FAStT PVs to replace the SSA disks, we would proceed as follows:

# extendvg datavg hdisk11
# migratepv hdisk1 hdisk11
# reducevg datavg hdisk1
# extendvg datavg hdisk12
# migratepv hdisk2 hdisk12
# reducevg datavg hdisk2
...
# extendvg datavg hdisk20
# migratepv hdisk10 hdisk20
# reducevg datavg hdisk10
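
Because the extendvg, migratepv, and reducevg sequence is the same for each disk pair, it can also be scripted. This sketch assumes the hdisk numbering used in this example, with SSA disks hdisk1 through hdisk10 being replaced by FAStT disks hdisk11 through hdisk20:

# for i in 1 2 3 4 5 6 7 8 9 10
> do
>   extendvg datavg hdisk$((i + 10))
>   migratepv hdisk$i hdisk$((i + 10))
>   reducevg datavg hdisk$i
> done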

5.6.5 Other considerations

Note that if rootvg resides on SSA, then there are additional requirements for booting from FAStT. Not all AIX systems support boot from FAStT, and specific minimum requirements exist for FC adapter, system, and SAN switch microcode (installing the latest microcode is recommended in any case).

5.6.6 HACMP considerations

The above LVM migration procedures can be used from HACMP CSPOC (though older versions of HACMP have more limited support). The main requirements and configuration limitations for HACMP on the FAStT are documented at:

ftp://ftp.software.ibm.com/storage/fastt/fastt500/HACMP_config_info.pdf

It is worth noting that a single FAStT can be connected to more systems than SSA, which is limited to eight. SSA does have the capability to use target mode communications for a heartbeat path (two independent heartbeat paths are required for systems in a HACMP cluster); thus, if the customer uses target mode SSA for the second heartbeat path, then an alternative heartbeat path must be determined for the systems. Alternatives include:

� Using the SSA adapters without SSA disk

� Using RS-232 cables

� Using disk heartbeat at AIX 5.1 or 5.2 with HACMP 5.1 (AIX 5.1 requires an enhanced concurrent VG for such purposes which cannot be used in a rotating or cascading resource group, a restriction that doesn’t exist at AIX 5.2)

� Using another physically independent LAN

Note: smit panels exist for the commands that we have used, including migratepv, mirrorvg, unmirrorvg, extendvg, reducevg, rmlvcopy, chvg, varyonvg, and varyoffvg.

Tip: A good practice is to try the process on a test Volume Group.


With more than two nodes in an HACMP cluster, a heartbeat ring is configured; thus with nodeA, nodeB, and nodeC, heartbeat paths are set up from nodeA to nodeB, nodeB to nodeC, and nodeC to nodeA.

SSA fiber extenders

SSA fiber extenders allow SSA disks to be attached up to 10 km away. However, from a performance perspective, these distances are limited to about 2 km for high transaction rates. The FAStT can be attached at up to 10 km without this performance degradation. This requires two FC switches (one at each location) and a switch-to-switch longwave optic link between them.

Generally, if SSA extenders are used, there is also a second system at the second location to access the data in the event of a site disaster, and often this is in an HACMP cluster. The SSA data can be mirrored using LVM, or SSA RAID 10 can be used, to ensure that a copy of the data is available. With the FAStT we can also mirror the data with LVM mirroring, but the FAStT does not support a RAID 10 array with one copy in each of two separate FAStT subsystems.

Figure 5-7 shows an alternative configuration for a customer who is currently using SSA extenders in an HACMP cluster.

Figure 5-7 FAStT mirror using LVM

Another alternative is to use FAStT Remote Volume Mirroring. This alternative would require manual intervention in the event of a site disaster, a failure of the primary server, or a failure of the primary FAStT.


Chapter 6. IBM migration services

Storage technologies are continually improving and offering a better price/performance ratio. Not only can you buy bigger, faster, and cheaper disks to handle the growing amount of data needed online, but today, new storage system functions can add flexibility for on demand computing, help simplify storage management, and provide automation to help reduce manpower needs — all helping to lower the total cost of storage.

Yet many IT groups postpone highly justifiable upgrades simply because they are concerned that the upgrade process itself could disrupt ongoing business operations. IBM can help address that concern with hardware assisted data migration services.


6.1 Data migration

The traditional data migration methods can typically be categorized as follows:

� Copy on tape and restore:

The traditional path to migration, a tape dump, involves manually taking a point-in-time copy of data and staging it to tape. After new storage devices are installed, the data on tape is restored to the new device. Conceptually simple, this process has one significant disadvantage: it can involve significant downtime.

To maintain data consistency, application access to stored data generally must be stopped throughout the backup and restore process. For business applications that can accommodate an outage, this procedure is a simple and straightforward means of populating new storage devices. However, those applications with demanding uptime requirements need a less intrusive alternative.

� Host-based software approach:

Host-based software data transfer tools help address some of the inadequacies of traditional tape-based migration by allowing a degree of application access to data during the copy process. The host software captures application I/O operations before they leave the server and replicates them on the new storage devices. But installing the software may require one or more reboots or IPLs. This approach typically requires additional software between the operating system and the physical devices on each host system being migrated. Often, additional hardware may be necessary to maintain host system performance at pre-migration levels.

6.2 Piper: IBM hardware-assisted data migration

IBM hardware-assisted data migration services can help address many of the problems of these traditional approaches. The IBM services team uses an appliance (also known as Piper) that transfers data between source and target volumes. This portable appliance is designed to be easily installed.

The Piper based migration is designed to be accomplished while the host is either online or offline. Obviously, online migration helps minimize the need for application downtime during storage upgrades. But if downtime is not a concern, offline migration (where the host applications are stopped) allows for even greater data migration speed.

6.3 What is Piper Lite?

Piper Lite is the data migration hardware appliance used by IBM for open systems, modeled on the successful Piper.

Piper Lite allows small to medium enterprises to migrate to new storage technologies easily, with minimum downtime. The migration appliance, lightweight and self-contained in a rugged case with a telescoping handle and wheels, can easily be transported as checked baggage. It is essentially a SAN and FC/FC router configuration, with enabling software and scripts that can be used for LUN-level data migration between heterogeneous enterprise storage platforms.

Piper Lite can migrate four host systems simultaneously online, in a single path configuration, and can perform a migration of up to 256 LUNs at speeds of 180 to 216 GB/hr (offline migrations typically achieve the higher end of this range). Typically, you would be able to migrate 1 TB in less than 5 hours.


Support is available for Windows, AIX, HP-UX, Solaris, Novell, DGUX, Dynix, Linux, NCR, and Tru64.

Piper Lite features

Piper Lite offers the following features:

� Graphical User Interface (GUI) driven migration application (Figure 6-1 on page 123).

� Two new applications:
  – Migration Director (Figure 6-1 on page 123)
  – Surveyor (Figure 6-2 on page 123)

� Offline migration planning:
  – Ability to import host discovery script outputs
  – Ability to plan and save multiple migration sessions

� Eliminates the need for multiple map files

� Upload of migration plan to speed time to migration:
  – Maintain multiple migration sessions from a single plan

6.3.1 Piper hardware appliance

The Fibre Channel Data Migration Tool has been constructed from modules based on proven FC (Fibre Channel) technology.

Piper contains:

� 2 x data migration engines
� 1 x Ethernet hub
� 1 x management laptop
� Interface support for FC, SCSI, and SSA
� Management software and execution scripts
� Power distribution

Piper offers:

� Ability to migrate 1 TB in less than 5 hours
� Support for Windows, AIX, HP-UX, Solaris, Novell, DGUX, Dynix, Linux, NCR, and Tru64

Table 6-1 Piper comparison chart

Piper Lite                                | Piper Medium                                                     | Piper Heavy
2 x DME                                   | 4 x DME                                                          | 8 x DME
180-216 GB/hr                             | 360-432 GB/hr                                                    | 720-864 GB/hr
4 host systems in parallel (online)       | 8 host systems in parallel (online)                              | 32 host systems in parallel (online)
Single path capable                       | Multi path capable                                               | Multi path capable
256 LUNs for online migration             | 512 LUNs for online single path, 256 LUNs for online multi path  | 1024 LUNs for online single path, 512 LUNs for online multi path
512 LUNs for offline                      | 1024 LUNs for offline                                            | 2048 LUNs for offline
FC, SCSI, SSA capable                     | FC, SCSI, SSA capable                                            | FC, SCSI, SSA capable
Management via provided laptop            | Management via provided laptop                                   | Management via rack mounted PC
Single power feed                         | Single power feed                                                | Redundant power feeds
GUI migration software                    | GUI migration software                                           | Command line migration software
Migration planning software               | Migration planning software                                      | Migration planning software
WWNN / WWPN based source drive protection | WWNN / WWPN based source drive protection                        | WWNN / WWPN based source drive protection
Compact footprint (checked baggage)       | Compact footprint (checked baggage)                              | Large footprint (19" half height rack)
SMB level migrations                      | SMB and enterprise level migration                               | SMB and enterprise level migration
Migrates 1 TB in 5 hours                  | Migrates 2 TB in 5 hours                                         | Migrates 4 TB in 5 hours


6.3.2 Piper Migration Director

Migration Director, the application specifically designed for data migration, runs on the laptop integrated into the Piper appliance and provides unparalleled migration functionality, with the ability to order the sequence of LUN migration:

� It offers host and device side WWNN and WWPN spoofing:

– No longer do you need to learn all storage management applications.
– It removes the requirement for altering LUN assignments and SAN zoning.

� Source drive protection:

– The protection is no longer just at the vendor level.
– WWNN and WWPN protection provide source drive protection for like-to-like situations, or when storage frames are from the same vendor.

� It now offers the ability to alter the order in which LUNs are migrated:

– This can be done before and during the migration:

• You can force the completion of a set of LUNs, which may be associated with a particular host system, in the event that a cut-over window changes.

• It is also possible to alter migration speeds and threads.

� It no longer requires the use of the CLI (Command Line Interface) to alter the migration speed or number of threads assigned.

� No longer are migrations limited to the CLI.

� GUI driven applications remove the need for extended knowledge of CLI commands and flags.

� CLI is still available.


Figure 6-1 A screen from Piper Migration Director

6.3.3 Piper Migration Surveyor

Migration Surveyor is specifically designed for migration planning.

This standalone application allows you to build the migration plan without the need for the Piper hardware to be connected to your system.

Figure 6-2 A screen from the Piper Migration Surveyor


Surveyor features:

� Can import host discovery outputs to build a view of the environment
� Variable input fields allowing for the import of non-standard formatted data
� Can build the migration session before getting to the customer site
� Standardizes the migration process
� Provides a standard output for use with the Director application
� Offers one-button output

This survey data is then provided to IBM for analysis and used for planning the migration activity.

6.3.4 Data migration process

A data migration typically follows these steps:

� The data migration hardware appliance is installed in the data path.

� The data is migrated from the source volumes to the target volumes. All host system I/O is accessed through the data migration hardware appliance.

� Upon completion of the migration, the appliance is removed from the data path and the target volumes are attached to the host systems.

Generally no more intrusive than a Fibre Channel switch, the data migration appliance is designed for simple installation. IBM services professionals simply connect it to the host system's storage interfaces, as shown in Figure 6-3. Because the data migration appliance can be configured before it is placed in the data path, the time to physically attach and remove it can be as little as 15 minutes, generally much less time than required for software-based migration approaches.

Figure 6-3 Piper Lite connected in customer system


Typically, no changes to the customer host systems or storage area network (SAN) fabric are required. The data-moving elements of the data migration hardware appliance report the source storage system’s vital product data. Because no data flows through the management interface, there is no external access to the proprietary data being migrated — thereby helping maintain data security and integrity.

Since the migration device performs a block level data copy, it moves everything from the source LUN to the target LUN, including the file system information. When the host is booted from the target storage, the file system information will be identical to when it was last shut down on the source storage.

Note: For more information on the Piper Lite Migration Tool, please contact your IBM Service representative or visit:

http://www.storage.ibm.com/services/featured/hardware_assist.html


Chapter 7. FAStT and HACMP for AIX

In this chapter, we present and discuss configuration information relevant to the FAStT Storage Server attached to IBM eServer pSeries servers with High Availability Cluster Multiprocessing (HACMP) installed under AIX.

HACMP is the IBM software for building highly available clusters on IBM Scalable POWERparallel (SP) systems, pSeries systems, or a combination of both. It is supported on a wide range of IBM eServer pSeries systems, storage systems, and network types, and it is one of the highest-rated UNIX-based clustering solutions in the industry.


7.1 HACMP introduction

Clustering (of servers) is the linking of two or more computers or nodes into a single, unified resource. High-availability clusters are designed to provide continuous access to business-critical data and applications through component redundancy and application failover. HACMP links IBM eServer pSeries servers or logical partitions (LPARs) of pSeries servers into high-availability clusters. These servers or LPARs can also be part of an IBM Cluster 1600, which can simplify multisystem management and help reduce the cost of ownership.

HACMP provides concurrent access to IT resources and the fault resilience required for business-critical applications. It is designed to automatically detect system or network failures and eliminate a single point-of-failure by managing failover to a recovery processor with a minimal loss of end-user time.

The current release of HACMP can detect and react to software failures severe enough to cause a system crash and network or adapter failures. The Enhanced Scalability capabilities of HACMP offer additional availability benefits through the use of the Reliable Scalable Cluster Technology (RSCT) function of AIX. The Concurrent Resource Manager of HACMP provides concurrent access to shared disks in a highly available cluster, allowing tailored actions to be taken during takeover to suit business needs. HACMP can also detect software problems that are not severe enough to interrupt proper operation of the system, such as process failure or exhaustion of system resources. HACMP monitors, detects, and reacts to such failure events, allowing the system to stay available during random, unexpected software problems. HACMP can be configured to react to hundreds of system events.

HACMP makes use of redundant hardware configured in the cluster to keep an application running, restarting it on a backup processor if necessary. This minimizes expensive downtime for both planned and unplanned outages and provides flexibility to accommodate changing business needs. Up to 32 pSeries or IBM RS/6000® servers can participate in an HACMP cluster, ideal for an environment requiring horizontal growth with rock-solid reliability.

Using HACMP can virtually eliminate planned outages, because users, applications, and data can be moved to backup systems during scheduled system maintenance. Such advanced features as Cluster Single Point of Control and Dynamic Reconfiguration allow the automatic addition of users, files, hardware, and security functions without stopping mission-critical jobs.

HACMP clusters can be configured to meet complex and varied application availability and recovery needs. Configurations can include mutual takeover or idle standby recovery processes. With an HACMP mutual takeover configuration, applications and their workloads are assigned to specific servers, thus maximizing application throughput and leveraging investments in hardware and software. In an idle standby configuration, an extra node is added to the cluster to back up any of the other nodes in the cluster.

In an HACMP environment, each server in a cluster is a node. Each node has access to shared disk resources that are accessed by other nodes. When there is a failure, HACMP transfers ownership of shared disks and other resources based on how you define the relationship among nodes in a cluster. This process is known as node failover or node failback. HACMP supports two modes of operation:

� HACMP classic:

– High Availability Subsystem (HAS).
– Concurrent Resource Manager (CRM).
– High Availability Network File System (HANFS); this is included in HACMP and HACMP/ES since Version 4.4.0.


� HACMP/ES (from Version 5.2, only HACMP/ES is on the market):

– Enhanced Scalability (ES).
– Enhanced Scalability Concurrent Resource Manager (ESCRM).

HACMP classic

High Availability Subsystem (HAS) uses the global Object Data Manager (ODM) to store information about the cluster configuration and can have up to eight HACMP nodes in a HAS cluster. HAS provides the base services for cluster membership, system management, and configuration integrity. Control, failover, recovery, cluster status, and monitoring facilities are also provided for programmers and system administrators.

The Concurrent Resource Manager (CRM) feature optionally adds the concurrent shared-access management for the supported RAID and SSA disk subsystem. Concurrent access is provided at the raw logical volume level, and the applications that use CRM must be able to control access to the shared data. The CRM includes the HAS, which provides a distributed locking facility to support access to shared data.

Before HACMP Version 4.4.0, a system that needed a highly available Network File System (NFS) had to use High Availability Network File System (HANFS). HANFS for AIX Version 4.3.1 and earlier provides a reliable NFS server capability by allowing a backup processor to recover current NFS activity should the primary NFS server fail. The HANFS for AIX software supports only two nodes in a cluster.

Since HACMP Version 4.4.0, the HANFS features are included in HACMP, and therefore, the HANFS is no longer a separate software product.

HACMP/ES and ESCRM

Scalability and the support of large clusters, and therefore large configurations of nodes and potentially disks, lead to a requirement to manage “clusters” of nodes. To address these management issues and take advantage of new disk attachment technologies, HACMP Enhanced Scalability (HACMP/ES) was released. It was originally available only for the SP, where tools were already in place with PSSP to manage larger clusters.

ESCRM optionally adds concurrent shared-access management for the supported RAID and SSA disk subsystems. Concurrent access is provided at the raw disk level. The application must support some mechanism to control access to the shared data, such as locking. The ESCRM component includes the HACMP/ES components and the HACMP distributed lock manager.
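As a quick sanity check in an existing environment, you can see which HACMP subsystems are active on a node with the standard AIX and HACMP utilities shown below. This is a minimal sketch: the paths shown are the usual HACMP/ES locations, and the exact output and file locations vary with the HACMP version installed.

# List the HACMP subsystems (clstrmgr, clinfo, and so on) known to the AIX SRC
lssrc -g cluster

# Display cluster, node, and network state (usual HACMP/ES location)
/usr/es/sbin/cluster/clstat

# List the cluster network interfaces defined to HACMP
/usr/es/sbin/cluster/utilities/cllsif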

7.2 Supported environment

For up-to-date information on the supported environments, refer to:

ftp://ftp.software.ibm.com/storage/fastt/fastt500/HACMP_config_info.pdf

Important: Before installing FAStT in an HACMP environment, always read the AIX readme file, the FAStT readme for the specific Storage Manager version and model, and the HACMP configuration information.


7.2.1 General rules

The primary goal of an HACMP environment is to eliminate single points of failure. Figure 7-1 shows a diagram of a two-node HACMP cluster (two nodes is not a limitation; you can have more) attached to the FAStT Storage Server through a fully redundant Storage Area Network. This type of configuration prevents a Fibre Channel (FC) adapter, switch, or cable from being a single point of failure (HACMP itself protects against a node failure).

Using only a single FC switch would be possible (with additional zoning), but the switch would then be a single point of failure: if it fails, you cannot access the FAStT volumes from either HACMP cluster node, so with only one FC switch, HACMP would be useless in the event of a switch failure. The configuration shown in Figure 7-1 is therefore the recommended configuration for a fully redundant production environment. Each HACMP cluster node should also contain two Fibre Channel host adapters to eliminate the adapter as a single point of failure. Notice also that each adapter in a particular cluster node goes to a separate switch (cross cabling).

FAStT models can be ordered with more host ports. In this example, only two host attachments are needed; buying additional mini hubs is not necessary, but can be done for performance or security reasons. Zoning on the FC switches must be done as detailed in Figure 7-1, so that every adapter in the AIX system can see only one controller (these are AIX-specific zoning restrictions, not HACMP-specific).

Figure 7-1 HACMP cluster with attachment to FAStT
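As an illustration only, the two zones of Figure 7-1 could be defined as follows on a Brocade-based fabric switch (for example, an IBM 2109). The alias names and WWPNs are hypothetical, and the command set differs on other switch vendors' products, so treat this as a sketch rather than a procedure.

# Aliases for the host HBAs and the FAStT controller host port (example WWPNs)
alicreate "AIX1_fcs0", "10:00:00:00:c9:2b:00:01"
alicreate "AIX2_fcs0", "10:00:00:00:c9:2b:00:02"
alicreate "FAStT_CtrlA", "20:02:00:a0:b8:0c:00:01"

# Zone1 on the first switch: one HBA from each cluster node plus controller A only
zonecreate "Zone1", "AIX1_fcs0; AIX2_fcs0; FAStT_CtrlA"

# Add the zone to a configuration and activate it
# (Zone2, on the second switch, would be built the same way with controller B)
cfgcreate "HACMP_cfg", "Zone1"
cfgenable "HACMP_cfg"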

7.2.2 Configuration limitations

When installing FAStT in an HACMP environment, there are some restrictions and guidelines to take into account, which we list here. A configuration that deviates from them will not necessarily fail, but it could lead to unpredictable results and be hard to manage and troubleshoot.


Applicable pSeries and AIX limitations (not HACMP specific)

The following AIX and pSeries restrictions, relevant in an HACMP environment, apply to the FAStT200, FAStT600, FAStT700, and FAStT900 Storage Servers:

� A maximum of four HBAs per AIX host (or LPARs) can be connected to a single FAStT storage server. You can configure up to two HBAs per partition and up to two partitions per FAStT storage server. Additional HBAs can be added to support additional FAStT storage servers and other SAN devices, up to the limits of your specific server platform.

� All volumes that are configured for AIX must be mapped to an AIX host group. Connecting and configuring to volumes in the default host group is not allowed.

� Other storage devices, such as tape devices or other disk storage, must be connected through separate HBAs and SAN zones.

� Each AIX host attaches to FAStT Storage Servers using pairs of Fibre Channel adapters (HBAs):

– For each adapter pair, one HBA must be configured to connect to controller A, and the other to controller B.

– Each HBA pair must be configured to connect to a single partition in a FAStT Storage Server or multiple FAStT Storage Servers (fanout).

– To attach an AIX host to a single or multiple FAStTs with two partitions, two HBA pairs must be used.

� The maximum number of FAStT partitions (host groups) per AIX host per FAStT storage subsystem is two.

� Zoning must be implemented. If zoning is not implemented in a proper way, devices might appear on the hosts incorrectly. Follow these rules when implementing the zoning:

– Single-switch configurations are allowed, but each HBA and FAStT controller combination must be in a separate SAN zone.

– Each HBA within a host must be configured in a separate zone from other HBAs within that same host when connected to the same FAStT controller port. In other words, only one HBA within a host can be configured with a given FAStT controller port in the same zone.

– Hosts within a cluster can share zones with each other.

– For highest availability, distributing the HBA and FAStT connections across separate FC switches minimizes the effects of a SAN fabric failure.

General limitations and restrictions for HACMP

Keep in mind the following general limitations and restrictions for HACMP:

� Only switched fabric connections are allowed between the host node and the FAStT; direct-attach connections are not supported.

� HACMP C-SPOC cannot be used to add a FAStT disk to AIX through the “Add a Disk to the Cluster” facility.

� FAStT subsystems with EXP100 disk enclosures are not supported in HACMP configurations at this time.

� Concurrent and Non-Concurrent modes are supported with HACMP Versions 4.4.1 and 4.5 and FAStT running Storage Manager Versions 8.21 or 8.3, including Hot Standby and Mutual Take-over.

� HACMP Versions 4.4.1 and 4.5 are supported on the pSeries 690 LPAR clustered configurations.


� HACMP is now supported in Heterogeneous server environments. For more information regarding a particular operating system environment, refer to the specific Installation and Support Guide.

� HACMP clusters can support 2-32 servers per FAStT partition. In this environment, be sure to read and understand the AIX device drivers queue depth settings, as documented in the IBM TotalStorage FAStT Storage Manager 8.4 Installation and Support Guide for AIX, UNIX, and Solaris, GC26-7574.

� Non-clustered AIX hosts can be connected to the same FAStT that is attached to an HACMP cluster, but must be configured on separate FAStT host partitions.

� Single HBA configurations are allowed, but each single HBA configuration requires that both controllers in the FAStT be connected to a switch within the same SAN zone as the HBA. While single HBA configurations are supported, they are not recommended for HACMP environments, because they introduce a single point of failure in the storage I/O path.
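Once the attachment and zoning rules above are satisfied, the controller paths and the queue depth setting can be checked from each AIX node. This is a minimal sketch that assumes the FAStT RDAC driver and its utilities are installed; the device name (hdisk4) is an example, and the appropriate queue depth value should be taken from the Installation and Support Guide referenced above.

# Show the FAStT disk array router/controller (dar/dac) to hdisk mapping
fget_config -Av

# List the FC adapters and the FAStT hdisks seen by this node
lsdev -Cc adapter | grep fcs
lsdev -Cc disk

# Check and, if necessary, change the queue depth of a FAStT hdisk
lsattr -El hdisk4 -a queue_depth
chdev -l hdisk4 -a queue_depth=16 -P    # -P defers the change to the next reboot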


Chapter 8. FAStT and GPFS for AIX

General Parallel File System (GPFS) is a cluster file system providing normal application interfaces and has been available on AIX operating system-based clusters since 1998. GPFS distinguishes itself from other cluster file systems by providing concurrent, very high-speed file access to applications executing on multiple nodes of an AIX cluster.

In this chapter, we describe configuration information for FAStT with GPFS in an AIX environment.


8.1 GPFS introduction

GPFS for AIX is a high-performance, shared-disk file system that can provide fast data access to all nodes in a cluster of IBM UNIX servers, such as IBM eServer Cluster 1600, pSeries, and RS/6000 SP systems. Parallel and serial applications can easily access files using standard UNIX file system interfaces, such as those in AIX. GPFS allows the creation of a subset of the nodes that make up an AIX cluster, called a nodeset, which is defined as those members of the cluster that are to share GPFS data. This nodeset can include all the members of the cluster.

GPFS is designed to provide high performance by “striping” I/O across multiple disks (on multiple servers); high availability through logging, replication, and both server and disk failover; and high scalability. Most UNIX file systems are designed for a single-server environment. Adding additional file servers typically does not improve the file access performance. GPFS complies with UNIX file system standards and is designed to deliver scalable performance and failure recovery across multiple file system nodes. GPFS is currently available as Versions 1.5 and 2.1, and is available in a number of environments:

� IBM UNIX clusters managed by the Parallel System Support Programs (PSSP) for AIX licensed program

� An existing RSCT peer domain managed by the Reliable Scalable Cluster Technology (RSCT) component of the AIX 5L™ operating system, beginning with GPFS 2.1

� An existing HACMP cluster managed by the High Availability Cluster Multiprocessing (HACMP) licensed program

GPFS provides file data access from all nodes in the nodeset by providing a global name space for files. Applications can efficiently access files using standard UNIX file system interfaces, and GPFS supplies the data to any location in the cluster. A simple GPFS model is shown in Figure 8-1.

Figure 8-1 Simple GPFS model


In addition to existing AIX administrative file system commands, GPFS has functions that simplify multinode administration. A single GPFS multinode command can perform file system functions across the entire GPFS cluster and can be executed from any node in the cluster.
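For example, the following commands can be issued from any node in the cluster and report on the cluster configuration and on a file system as a whole; the file system name fs1 is an example only.

# Display the GPFS cluster definition and configuration parameters
mmlscluster
mmlsconfig

# Report the attributes, free space, and disk status of file system fs1
mmlsfs fs1
mmdf fs1
mmlsdisk fs1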

GPFS supports the file system standards of X/Open 4.0 with minor exceptions. As a result, most AIX and UNIX applications can use GPFS data without modification, and most existing UNIX utilities can run unchanged.

High performance and scalability

By delivering file performance across multiple nodes and disks, GPFS is designed to scale beyond single-node and single-disk performance limits. This higher performance is achieved by sharing access to the set of disks that make up the file system. Additional performance gains can be realized through client-side data caching, large file block support, and the ability to perform read-ahead and write-behind functions. As a result, GPFS can outperform Network File System (NFS), Distributed File System (DFS™), and Journaled File System (JFS). Unlike these other file systems, GPFS file performance scales as additional file server nodes and disks are added to the cluster.

Availability and recovery

GPFS can survive many system and I/O failures. It is designed to transparently fail over lock servers and other GPFS central services. GPFS can be configured to automatically recover from node, disk connection, disk adapter, and communication network failures:

� In an IBM Parallel System Support Programs (PSSP) cluster environment, this is achieved through the use of the clustering technology capabilities of PSSP in combination with the PSSP Recoverable Virtual Shared Disk (RVSD) function or disk-specific recovery capabilities.

� In an AIX cluster environment, this is achieved through the use of the cluster technology capabilities of either an RSCT peer domain or an HACMP cluster, in combination with the Logical Volume Manager (LVM) component or disk specific recovery capabilities.

GPFS supports data and metadata replication, to further reduce the chances of losing data if storage media fail. GPFS is a logging file system that allows the re-creation of consistent structures for quicker recovery after node failures. GPFS also provides the capability to mount multiple file systems. Each file system can have its own recovery scope in the event of component failures.

All application compute nodes are directly connected to all storage, possibly through a SAN switch, and GPFS allows all compute nodes in the nodeset to have coherent and concurrent access to all storage. GPFS can also use the IBM SP Switch and SP Switch2 technology instead of a SAN: it uses the Recoverable Virtual Shared Disk (RVSD) capability, currently available on the RS/6000 SP and Cluster 1600 platforms, to access storage attached to other nodes in support of applications running on compute nodes. The RVSD provides a software simulation of a Storage Area Network over the SP Switch or SP Switch2.

8.2 Supported configurations

GPFS runs on an IBM eServer Cluster 1600 or a cluster of pSeries nodes, the building blocks of the Cluster 1600. Within each cluster, the network connectivity and disk connectivity vary depending upon your GPFS cluster type.

Table 8-1 summarizes network and disk connectivity per cluster type.


Table 8-1 GPFS clusters: Network and disk connectivity

� SP (PSSP): Network connectivity through the SP Switch or SP Switch2; disk connectivity through virtual shared disk servers.
� RPD (RSCT peer domain): An IP network of sufficient network bandwidth (minimum of 100 Mbps); disks are Storage Area Network (SAN)-attached to all nodes in the GPFS cluster.
� HACMP: An IP network of sufficient network bandwidth (minimum of 100 Mbps); disks are SAN-attached to all nodes in the GPFS cluster.

The SP cluster type and supported configurations

The GPFS cluster type SP is based on the IBM Parallel System Support Programs (PSSP) licensed product and the shared disk concept of the IBM Virtual Shared Disk component of PSSP. In the GPFS cluster type SP (a PSSP environment), the nodes that are members of the GPFS cluster depend on the network switch type being used. In a system with an SP Switch, the GPFS cluster is equal to all of the nodes in the corresponding SP partition that have GPFS installed. In a system with an SP Switch2, the GPFS cluster is equal to all of the nodes in the system that have GPFS installed. That is, the cluster definition is implicit, and there is no need to run the GPFS cluster commands. Within the GPFS cluster, you define one or more nodesets within which your file systems operate.

In an SP cluster type, GPFS requires the Parallel System Support Programs (PSSP) licensed product and its IBM Virtual Shared Disk and IBM Recoverable Virtual Shared disk components (RVSD) for uniform disk access and recovery.

For the latest supported configurations, refer to:

ftp://ftp.software.ibm.com/storage/fastt/fastt500/PSSP-GPFS_config_info.pdf

RPD cluster type and supported configurations

The GPFS cluster type RPD is based on the Reliable Scalable Cluster Technology (RSCT) subsystem of AIX 5L. The GPFS cluster is defined over an existing RSCT peer domain. The nodes that are members of the GPFS cluster are defined with the mmcrcluster, mmaddcluster, and mmdelcluster commands. With an RSCT peer domain, all nodes in the GPFS cluster have the same view of the domain and share the resources within the domain. Within the GPFS cluster, you define one or more nodesets within which your file systems operate.

In an RPD cluster type, GPFS requires the RSCT component of AIX. The GPFS cluster is defined on an existing RSCT peer domain with the mmcrcluster command.

For the latest supported configurations, refer to:

ftp://ftp.software.ibm.com/storage/fastt/fastt500/PSSP-GPFS_config_info.pdf

Important: Before installing FAStT in a GPFS environment, always read the AIX readme file and the FAStT readme for the specific Storage Manager version and model.

HACMP cluster type and supported configurations

The GPFS cluster type HACMP is based on the IBM High Availability Cluster Multiprocessing/Enhanced Scalability for AIX (HACMP/ES) licensed product. The GPFS cluster is defined over an existing HACMP cluster. The nodes that are members of the GPFS cluster are defined with the mmcrcluster, mmaddcluster, and mmdelcluster commands. Within the GPFS cluster, you define one or more nodesets within which your file systems operate.

In an HACMP cluster type, GPFS requires the IBM HACMP/ES licensed product, over which the GPFS cluster is defined with the mmcrcluster command.
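A hedged sketch of that definition follows. The node descriptor file, the cluster type flag, and the primary/secondary configuration server options shown here are assumptions for illustration; the exact syntax differs between GPFS releases, so always check the mmcrcluster man page for your level.

# /tmp/gpfs.nodes - node descriptor file listing the HACMP cluster nodes
#   node1
#   node2
#   node3

# Define the GPFS cluster over the existing HACMP cluster
# (-t hacmp, -p primary and -s secondary configuration servers are assumed options)
mmcrcluster -t hacmp -n /tmp/gpfs.nodes -p node1 -s node2

# Nodes can later be added to or removed from the cluster
mmaddcluster node4
mmdelcluster node4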

General configuration limitations

There are some general limitations to follow when configuring FAStT Storage Servers in a GPFS environment:

� The FAStT200 is not supported in RVSD or GPFS on HACMP cluster configurations.

� FAStT subsystems with EXP100 disk enclosures are not supported in RVSD or GPFS configurations at this time.

� RVSD and GPFS Clusters are not supported in Heterogeneous Server configurations.

� Only switched fabric connections, not direct connections, are allowed between the host node and the FAStT.

� Each AIX host attaches to FAStT Storage Servers using pairs of Fibre Channel adapters (HBAs):

– For each adapter pair, one HBA must be configured to connect to controller A, and the other to controller B.

– Each HBA pair must be configured to connect to a single partition in a FAStT Storage Server or multiple FAStT Storage Servers (fanout).

– To attach an AIX host to a single or multiple FAStTs with two partitions, 2 HBA pairs must be used.

� The maximum number of FAStT partitions (host groups) per AIX host per FAStT storage subsystem is two.

� A maximum of four partitions per FAStT is allowed for RVSD and HACMP/GPFS cluster configurations.

� RVSD clusters can support a maximum of two IBM Virtual Shared Disk and RVSD servers per FAStT partition.

� HACMP/GPFS clusters can support 2-32 servers per FAStT partition. In this environment, be sure to read and understand the AIX device drivers queue depth settings as documented in the IBM TotalStorage FAStT Storage Manager 8.4 Installation and Support Guide for AIX, UNIX, and Solaris, GC26-7574.

� Single Node Quorum is not supported in a two-node GPFS cluster with FAStT disks in the configuration.

� SAN Switch zoning rules:

– Each HBA within a host must be configured in a separate zone from other HBAs within that same host when connected to the same FAStT controller port. In other words, only one HBA within a host can be configured in the same zone with a given FAStT controller port.

– The two hosts in a RVSD pair can share zones with each other.

� For highest availability, distributing the HBA and FAStT connections across separate FC switches minimizes the effects of a SAN fabric failure.

� No disk (LUN on FAStT) can be larger than 1 TB.

� You cannot protect your file system against disk failure by mirroring data at the LVM level. You must use GPFS replication or RAID devices to protect your data (FAStT RAID levels).
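Because LVM mirroring cannot be used, protection has to come from the FAStT RAID level, from GPFS replication, or from both. The following is a hedged sketch of creating a GPFS file system with two data and metadata replicas; the disk descriptor file, block size, and mount point are examples, and option names can vary between GPFS releases.

# /tmp/fs1.disks - disk descriptor file naming the FAStT LUNs (hdisks or VSDs)
# to be used by the file system; its format depends on the GPFS cluster type

# Create file system fs1 with two replicas of data (-r/-R) and metadata (-m/-M),
# a 256 KB block size, and mount point /gpfs/fs1
mmcrfs /gpfs/fs1 fs1 -F /tmp/fs1.disks -B 256K -m 2 -M 2 -r 2 -R 2

# Verify the replication settings
mmlsfs fs1 -m -r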


Part 3 VMware and FAStT

In this part of the book, we discuss VMware ESX Server 2.1 and present configuration options and special considerations for attachment to FAStT Storage Server.


Chapter 9. Introduction to VMware

This chapter starts with a brief overview of VMware, Inc., its products, and specifically the VMware ESX architecture. The chapter then focuses on the practical aspects of implementing VMware ESX Server on IBM eServer xSeries servers with IBM TotalStorage FAStT attached disk, as well as IBM eServer BladeCenter considerations.

The following topics are covered:

� VMware, Inc. and the IBM and VMware relationship

� VMware ESX Server v2.1 in detail

� The storage structure of VMware ESX Server - disk virtualization:

– Local disk and SAN disk usage

� Disk virtualization with VMFS volumes and .dsk files:

– VMFS access modes, ESX Server .dsk modes, and the Buslogic® and LSI SCSI Controllers

� FAStT and ESX Server solution considerations:

– Which model of FAStT should be used in a VMware implementation?

� FAStT tuning considerations:

– RAID array types, logical drives, segment size, and other settings

� IBM eServer BladeCenter and ESX Server


Attention: These chapters describe potential configurations and best practices for attaching ESX Server(s) to a FAStT. At the time of writing, no FAStT models had completed StorageProven certification with ESX Server v2.1. This information should not be used as a substitute for this certification. Be sure to check with the FAStT interoperability matrix for official support of your chosen configuration at:

http://www.storage.ibm.com/disk/fastt/supserver.htm.


9.1 VMware, Inc.

VMware, Inc. (http://www.vmware.com) was founded in 1998 to bring virtual machine technology to industry-standard computers. The company now offers the following virtual machine or vPlatform products:

� VMware Workstation: Allows you to run multiple operating systems (OS) and their applications simultaneously on a single Intel based PC that already has Windows or Linux OS installed; VMware Workstation installs like an application.

� VMware GSX Server: Runs within an OS also, but requires a server OS like Linux or Windows Server; it transforms the physical computer into a pool of virtual machines. Operating systems and applications are isolated in multiple virtual machines that reside on a single piece of hardware. System resources are allocated to any virtual machine based on need, delivering maximum capacity utilization and control over the computing infrastructure. VMware GSX Server can host up to 64 virtual machines (provided that the server hardware can sustain the load).

� VMware ESX Server: Virtualization software that enables the deployment of multiple, secure, independent virtual machines on a single physical server. It runs directly on the hardware, in contrast to the VMware Workstation and GSX Server products, which use a host operating system to access the hardware. Also unlike the other VMware products, ESX Server supports dynamic management of memory, CPU, disk, and network traffic. VMware ESX Server can host up to 80 virtual machines (provided that the server hardware can sustain the load). Each of these virtual machines can also now run in a two-processor configuration (the guest OS sees two CPUs) when using the Virtual SMP add-on.

In addition, VMware has also recently added extra management and migration products to its lineup; a brief description of the vManage and vTools products is as follows:

� VMware VirtualCenter: Is the management tool for managing an ESX environment where multiple ESX Server installations exist. This tool gives complete centralized control of the ESX Servers' resources and can also be used for deploying new virtual machines with guest OS’ on any ESX Server via templates/images that have been created. In addition, the VMotion add-on allows for migration of a running virtual machine from one ESX Server to another (that is attached to the same shared disk), for continuous operation where a physical server may need to be taken down for maintenance, or the load lightened on a heavily utilized ESX Server.

� VMware P2V Assistant: Helps to migrate a server environment from a physical server running a single Windows NT4 or Windows 2000 OS instance natively, into an ESX Server virtual machine. The P2V Assistant creates a clone image of the original in the form of bootable virtual disk that ESX will recognize.

For the purpose of this chapter, only ESX Server version 2.1 from VMware is being considered, although the best practices for a multiple ESX Server installation that runs VirtualCenter and VMotion will also be discussed.

9.2 The IBM and VMware relationship

The IBM and VMware relationship started in December 2000, when IBM joined the VMware Preferred Partner Program as a Charter Member. Since then, there have been many announcements regarding the strong relationship between the two organizations, including IBM becoming the first tier-one hardware vendor to sign a Joint Development Agreement (JDA) with VMware in February 2002. Most recently, this strategic alliance was officially extended through 2007.


To complement IBM eServer xSeries, BladeCenter, and TotalStorage solutions, IBM directly resells VMware software, in addition to offering global service and support for the VMware products. IBM is the only hardware vendor offering complete worldwide support for Windows, Linux, and key applications within the VMware environment, providing a single point of contact for support of the entire virtualized solution.

The following VMware and related products are available from IBM:

� VMware ESX Server (and ESX Server processor licence upgrades), including a one-year software subscription for all ESX updates released in that year

� VMware Virtual SMP (and Virtual SMP processor licence upgrades), including a one-year software subscription for all Virtual SMP updates released in that year

� VMware VirtualCenter, including Management Server, Agent, VMotion, and Virtual Infrastructure Node (licence bundle of ESX, SMP+VirtualCenter Agent+VMotion)

� Remote Technical Support (RTS) services - ServicePac® for Support Line for VMware on xSeries (IBM part numbers, 96P2704, 96P2705, 96P2706)

This ServicePac provides customers with one year of 24 x 7 remote telephone based software support for questions, problems, or queries concerning the operation and performance of the VMware product and guest operating system environments running on compatible IBM xSeries servers. The service provides the “one stop shop” for customers investing in VMware on xSeries by also providing support on the Microsoft and Linux Operating Systems within the VMware virtual machines. See:

http://www.ibm.com/services/its/us/servicepac.html

For more detail on the IBM and VMware relationship, see:

http://www.vmware.com/partners/hw/ibm.html
http://www.pc.ibm.com/ww/eserver/xseries/vmware.html

9.3 VMware ESX Server v2.1 architecture

VMware ESX Server is virtual infrastructure partitioning software designed for server consolidation, rapid deployment of new servers, increased availability, and simplified management, helping to improve hardware utilization, save space, and reduce IT staffing and hardware costs.

Many people may have had earlier experience with VMware's virtualization products in the form of VMware Workstation or VMware GSX Server. As mentioned earlier, VMware ESX Server is quite different from these products in that it runs directly on the hardware, offering a mainframe-class virtualization software platform that enables the deployment of multiple, secure, independent virtual machines on a single physical server.

ESX Server allows several instances of operating systems like Microsoft Windows NT 4.0, Windows 2000 Server, Windows Server 2003, Red Hat and (Novell) SuSE Linux, Novell Netware and more, to run in partitions independent of one another. Therefore this technology is a key software enabler for server consolidation that provides the ability to move existing, unmodified applications and operating system environments from a large number of older systems onto a smaller number of new high performance xSeries platforms.

Note: While support for VMware is available as a separate part number, it is not optional. It is absolutely mandatory based on the contractual agreement between VMware and IBM.


Real cost savings can be achieved by allowing for a reduction in the number of physical systems to manage, saving floor space, rack space, reducing power consumption, and eliminating the headaches associated with consolidating dissimilar operating systems and applications that require their own OS instance.

The architecture of VMware ESX Server is shown in Figure 9-1.

Figure 9-1 VMware ESX Server architecture

Additionally, ESX Server helps you build cost-effective, high-availability solutions by using failover clustering between virtual machines. Until now, system partitioning (the ability of one server to run multiple operating systems simultaneously) has been the domain of mainframes and other large midrange servers. But with VMware ESX Server, dynamic, logical partitioning can be enabled on IBM xSeries™ systems.

Instead of deploying multiple servers scattered around a company, each running a single application, those servers can be consolidated physically while enhancing system availability at the same time. VMware ESX Server allows each xSeries server to run multiple operating systems and applications in virtual machines, providing centralized IT management. Because these virtual machines are completely isolated from one another, if one were to go down, it would not affect the others.

This means that not only is VMware ESX Server software great for optimizing hardware usage, it can also give the added benefits of higher availability and scalability.


The features and benefits of VMware ESX Server are shown in Table 9-1.

Table 9-1 Features and benefits of VMware ESX Server

� Isolation:
– Multiple applications can share the same hardware and be completely protected from one another.
– Application isolation eliminates conflicts among applications, yet still allows customers to support multiple applications on the same system.
– Service Providers can maintain independent customer environments as if they were on physically separate machines.
– Fault isolation provides the high availability of separate, independent systems.

� Disaster Recovery: Cost effective way to back up entire OS and application images.

� Remote Management: Troubleshoot and manage system remotely.

� Resource Management: Dynamically manage the resources consumed by a specific environment.

� Legacy application support: Provision new hardware, yet preserve their investment in legacy applications.

� Encapsulation: Create an application environment and provision the application quickly to multiple users or systems.

� Provisioning: Introduce new applications faster than ever before, while meeting Service Level Agreements (SLA) and security requirements.

� Portability of computing environment: Provision resources instantly to meet SLAs, regardless of the hardware configuration.

� Dynamically assignable resources: Fully utilize and optimize system resources to customer needs.

� Logical sub-CPU partitioning: Make optimum use of existing CPU resources, especially for I/O intensive applications.

� Performance guarantee: Ensure that SLAs can be met; applications can't steal resources from one another.

� Hot standby of diverse operating environments: Delivers high availability: N+1 fail over for a diverse set of operating environments without running a separate instance of each application.

VMware ESX Server can provide all of these functions across a large and diverse range of xSeries servers from a dual-processor HS20 server blade, all the way to the highly scalable x445 server. For an up-to-date list of xSeries servers that are certified to run VMware, go to:

http://www.pc.ibm.com/us/compat/nos/vmware.html

And, for blades, go to:

http://www.pc.ibm.com/us/compat/nos/vmwaree.html


9.4 VMware ESX Server storage structure: disk virtualization

In addition to the disk virtualization that is offered by a SAN (Figure 9-2), VMware further abstracts the disk subsystem from the guest OS. It is important to understand this structure in order to make sense of the best practice options when connecting VMware ESX Servers to a SAN-attached FAStT subsystem.

Figure 9-2 FAStT disk virtualization

The following sections will step through the various levels of ESX Server disk virtualization and bring the whole picture together in Figure 9-7 on page 150.

9.4.1 Local disk usage

The disks that ESX Server will use for its boot partition for the console OS (management OS) are usually local disks that have a partition and file structure akin to the Linux file hierarchy (Figure 9-3). Provided that SCSI disks are used, a partition formatted with the VMware ESX Server File System (VMFS) can be located on this boot disk and may be used for the VMware guest OS swap file and the VMware core dump partition (local disk for these components is recommended for SAN-attached ESX Servers). A VMFS on local SCSI disks can also be used for virtual machine disk (.dsk) files, although it is recommended that these only be lightly used, such as for storage of a guest OS image.

Figure 9-3 The ESX Server Console OS disks and partition
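From the console OS, the local Linux-style partitions and any VMFS volumes can be inspected directly. This is a minimal sketch; the device name /dev/sda and the VMFS label vmfsvol0 are examples only.

# Local console OS partitions (boot, root, swap) on the internal disks
df -h
fdisk -l /dev/sda

# VMFS volumes are visible under /vmfs by label (or vmhba name);
# the virtual machine .dsk files and redo logs are stored inside them
ls -lh /vmfs/
ls -lh /vmfs/vmfsvol0/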


9.4.2 SAN disk usage

ESX Server v2.1 introduced enhanced support for SAN-based disks. The way in which SAN disk is used in ESX Server is as follows (see Figure 9-4):

� Once the FAStT storage server has been configured with arrays, logical drives, and storage partitions, these logical drives will be presented to the ESX Server.

� Two options exist for the use of these logical drives within ESX Server:

– Option 1: Formatting these disks with the VMFS: This option is most common, as a number of features require the virtual disks to be stored on VMFS volumes

– Option 2: Passing the disk through to the guest OS as a raw disk: No further virtualization occurs; the OS will write its own file system onto that disk directly, just as it would in a standalone environment, without an underlying VMFS structure.

� The VMFS volumes house the virtual disks that the guest OS will see as its real disks. These are in the form of what is effectively a file with the extension .dsk.

� The guest OS will either read/write to the virtual disk file (.dsk) or write through the ESX Server abstraction layer to a raw disk. In either case the guest OS will consider the disk to be real.

Figure 9-4 FAStT logical volumes to VMware VMFS volumes

9.4.3 Disk virtualization with VMFS volumes and .dsk files

The VMware ESX Server File System (VMFS) is the file system designed by VMware specifically for the ESX Server environment. It is designed to format very large disks (LUNs) and store the virtual machine .dsk files, which can also be very large. The VMFS volumes store:

� Virtual machine .dsk files

� The memory images from virtual machines that have been suspended

� Redo files for the .dsk files that have been set to a disk mode of nonpersistent, undoable, or append (see ESX Server .dsk modes, Figure 9-6 on page 149)

The virtual machine .dsk files represent what is seen as a physical disk by the guest OS (see Figure 9-5). These files have a number of distinct benefits over physical disks (although some of these functions are available through the advanced functions of a FAStT):

� They are portable so that they can be copied from one ESX Server to another either to move a virtual machine to that ESX Server or to create backup or test environments. When copied, they retain all of the structure of the original so that if it is the virtual machine's boot disk, then it will include all of the hardware drivers necessary to make it run on another ESX Server (although the .vmx configuration file also needs to be replicated in order to complete the virtual machine).


� They can easily be resized (using vmkfstools) if the virtual machine needs more disk space; see the sketch following Figure 9-5. This option presents a larger disk to the guest OS, which then also requires some way of accessing the additional space on that disk, such as a volume expansion tool like PowerQuest Volume Manager.

� They can be mapped and remapped on a single ESX Server for the purposes of keeping multiple copies of a virtual machine's data. Many more .dsk files can be stored for access by an ESX Server than are represented by the number of virtual machines configured.

Figure 9-5 VMFS volumes and .dsk files
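A hedged sketch of the vmkfstools operations referred to above, run from the ESX Server console OS, follows. The vmhba paths, the volume addressing, and the sizes are examples only, and the options should be confirmed against the ESX Server 2.1 documentation before use.

# Rescan an HBA after new FAStT logical drives have been mapped to this server
vmkfstools -s vmhba1

# Create a VMFS-2 volume on a FAStT LUN (adapter 1, target 0, LUN 1, partition 1)
vmkfstools -C vmfs2 vmhba1:0:1:1

# Create a 4 GB virtual disk on that volume, then grow it to 6 GB later
vmkfstools -c 4096m vmhba1:0:1:1:VM1_0.dsk
vmkfstools -X 6144m vmhba1:0:1:1:VM1_0.dsk

As noted above, after a .dsk file is grown, the guest OS still needs its own volume expansion tool to make use of the additional space.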

VMFS access modes

VMFS-formatted partitions have two access modes, public and shared. These two modes function in the following ways:

� Public: The behavior of this access mode varies depending upon the version of the VMFS volume (VMFS-1 or VMFS-2), and relies on the partition residing on shared storage. VMFS-1 partitions allow multiple ESX Servers to access the VMFS volume, but only one at a time. VMFS-2 partitions allow multiple ESX Servers to access the VMFS volume concurrently and use file locking to prevent contention on the .dsk files. Public is the default mode.

� Shared: This access mode allows a VMFS volume to be used for virtual machine clustering across multiple ESX Servers. VMware requires that only one .dsk file exist per shared VMFS volume, to avoid SCSI locking issues, because in shared mode all SCSI locking is handled by the virtual machines, not by the ESX Servers. If more than one .dsk file exists, both will be locked by the same virtual machine when only one needs to be locked by that virtual machine.

ESX Server .dsk modes

ESX Server has four modes of operation for .dsk file disks. The mode can be set from within the ESX Server management user interface (MUI), either during creation of the .dsk files or afterwards by editing an individual virtual machine's settings. The four modes are as follows:

� Persistent: The disk behaves just like a normal physical disk in a server; ESX Server writes changes to a persistent disk immediately.

� Nonpersistent: Changes that were made since the last time a virtual machine was powered on are lost when that VM is powered off (soft reboots do not count as being powered off).

� Undoable: Changes that were made since the last time a virtual machine was powered on are written to a log file (redo.log) for that VM. When that VM is powered off, ESX will ask whether to commit or discard the changes that were made during that session.


� Append: Similar to undoable, changes that are made are written to a log file (redo.log) for that VM since the append mode is activated. These changes can be written to the .dsk file using the commit command in vmkfstools. This differs from undoable in that the VM can be powered off multiple times without being asked to commit or discard the changes.

A diagram showing the write function of these modes is given in Figure 9-6.

Figure 9-6 ESX Server .dsk modes

Here are some examples of situations where the ESX .dsk modes might be used:

� Persistent: During normal operation.

� Nonpersistent: Where a server environment may need to be recovered to an original state very quickly, such as a Web server with static pages that has had some of those pages hacked. Simply powering the VM off and on again will reset the entire configuration. Training centers also find a use for this disk mode by having the ability to reset a training environment to a specific point simply by clicking a button.

� Undoable: Where a server needs the option of rollback, such as when applying a software service pack. The service pack can be installed and the server tested to see if the installation of the service pack is acceptable. If the installation worked, the changes can be committed; if the installation had problems, the original (pre-service pack installation) environment can be recovered by discarding the changes when the VM is powered off. This even includes the option of rolling back from changes that have caused a Windows blue-screen, an option that is not readily available outside of the virtualized environment.

� Append: Similar to undoable, with a more long term rollback option.
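At the configuration file level, the disk mode is simply a per-disk entry in the virtual machine's .vmx file, as in the illustrative excerpt below. The device names, the .dsk path, and the virtualDev value are hypothetical examples, and in practice the mode is normally changed through the MUI rather than by editing the file.

scsi0.present = "TRUE"
scsi0.virtualDev = "vmxbuslogic"
scsi0:0.present = "TRUE"
scsi0:0.name = "vmhba1:0:1:1:VM1_0.dsk"
scsi0:0.mode = "undoable"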

9.4.4 The Buslogic and LSI SCSI controllers

VMware ESX Server presents a virtual device to the guest OS, either a Buslogic or an LSI SCSI controller; this virtual device is used within the OS as if it were a physical disk controller. This highlights another benefit of VMware: the hardware presented is consistent across all VMs, which adds to the portability of a VM when the .dsk file is copied from one ESX Server to another. The virtual hardware remains the same for the guest OS regardless of whether the physical hardware of the ESX Server is the same or not.

There are two types of SCSI controllers that are presented to the virtual machine depending upon the type of guest OS. Windows Server 2003 uses the LSI Logic controller driver that is included in the Windows Server 2003 installation files. All other supported guest operating systems use the Buslogic controller driver that is included in their installation files.


It is possible to use the LSI Logic controller driver for Windows 2000 and Windows XP virtual machines, but this requires download of the LSI driver from the LSI Logic Web site. It is also possible to use the Buslogic adapter driver for Windows Server 2003 and Windows XP, but this driver needs to be downloaded from the VMware download Web site. More details about these two controllers, their drivers, and their driver update procedures can be found in the ESX Server 2.1 Administration Guide.

In reality there is little need to change from the default SCSI controller type; however, updating the Buslogic driver in Windows 2000 to the VMware supplied driver will improve virtual disk performance, and is recommended.

9.4.5 The complete picture of ESX and FAStT storage

Having looked at the components that make up the logical disk structure of VMware ESX Server, the complete picture is represented in Figure 9-7.

Figure 9-7 The logical disk structure of ESX Server

Note: With ESX Server 2.1.1, if you have configured a virtual machine to directly access a raw LUN on a SAN, and the virtual machine is using the virtual BusLogic SCSI adapter, you will see one or more resets on the SAN whenever that virtual machine boots.

This behavior occurs because the BusLogic driver always issues a SCSI bus reset when it is first loaded, and this SCSI bus reset is translated to a full SAN reset.


9.5 FAStT and ESX Server solution considerations

In this section, we review best practices for one or more ESX Servers attached to a FAStT, starting with some general performance and sizing considerations.

9.5.1 General performance and sizing considerations

The goal of this section is not to describe a general approach to sizing and performance, but rather to point out specific characteristics of a VMware ESX Server implementation.

When it comes to performance, it is important to remember that you should not automatically expect a virtual machine to exhibit the same performance characteristics as the physical server it emulates.

This is not to say that a virtual machine cannot cope with performance-intensive workloads. However, if achieving the highest possible performance is a major goal or requirement, VMware might not be the right choice. The same goes for workloads requiring large SMP systems (typically more than two CPUs).

In any case, if performance is the main concern, it is important to agree on the minimum acceptable performance figures, document them, and then run a Proof of Concept (POC) against them.

CPU overhead

The virtualization process introduces a CPU overhead that needs to be considered when sizing VMware solutions. The percentage of overhead depends on the nature of the workload. As a general guideline (based on numbers observed in actual implementations), you can use the following rules of thumb:

� Computation-intensive workload: negligible overhead (1-3%)
� Disk I/O-intensive workload: 20-30%
� Network I/O-intensive workload: 30-40% or even greater

In reality, you will typically have a mixed workload possibly resulting in an average overhead of 20-30%.

Network throughput

As with any other operating system, an instance of ESX Server has a given maximum network throughput capability. Adding faster CPUs or more and more network adapters has either very limited effect or no effect at all. Typically, a single Gigabit network adapter can achieve the maximum network throughput of a server (true for Windows 2000, for example, and also for VMware).

Again, there is no single figure for the maximum throughput of an ESX Server as it depends on specific characteristics of the network traffic, such as the transfer request size for instance. Observations indicate that a figure around 300 Mbps is a good assumption (regardless of the number of virtual machines or number of physical and virtual network adapters).

Take the following example: You want to host or consolidate 20 servers with ESX. The IBM xSeries x445 is the hardware platform you decided to use. Considering the CPU overhead that the virtualization introduces, you have determined that you need 16 physical CPUs to run your 20 servers. The systems will be relatively network-intensive, and you expect the average network throughput of each server to be 30 Mbps, requiring a total of around 600 Mbps.


In this case, you should not consider the deployment of a single 16-way x445 (as it will probably not be able to deliver the required network throughput), but rather opt for two 8-way or possibly even four 4-way systems.

Again, we are not saying that large SMP systems are not a good choice for ESX Server implementations. They obviously can be, but depending on the case, you might need to consider alternatives and compromise the management advantages that come with fewer larger systems.

If using VMotion, it is recommended to add an additional adapter (or pair) dedicated to VMotion traffic.

Console OS performance

It is important to understand that the console operating system is not tuned for performance. It is deliberately tuned to use minimal resources in order to allow better performance of the production guest OSes. Disk I/O in the console OS, for instance, is approximately one tenth of that of a guest OS. This becomes obvious when trying to copy .dsk files from within the console: it is slow, but working as designed.

9.5.2 Which model of FAStT should be used in a VMware implementation?

Unfortunately, there is no magic answer to this question. All of the FAStT storage servers could provide excellent functionality for attaching to ESX Servers. The answer lies in the specific requirements that the ESX Server is intended for and the expectations that need to be met in terms of performance, availability, capacity, and so on. One thing is certain: the sizing requirements for capacity and performance do not change when an ESX Server is being considered instead of a group of individual physical servers. Some consolidation of SAN requirements can be achieved; other requirements remain.

For example, because of under-utilization, considerable consolidation is often possible in the number of physical HBAs that are required, and therefore also in the number of SAN switch ports required to connect those HBAs. As both of these items come at a considerable cost, any reduction in the number required can represent significant savings. It is also common to find low bandwidth utilization of HBAs and SAN switch ports in a non-consolidated environment, which adds to the potential for consolidating these items.

On the other hand, it is common that individual physical disk utilization is high, and therefore reducing the number of physical disks is often not appropriate. This will be covered in more detail under the section entitled “RAID array types” on page 153.

Like in all SAN implementations, consideration must then be given to both the immediate requirement of the project and the possibilities for reasonable future growth. More detail regarding these considerations and others can be found in 2.1, “Planning your SAN” on page 14.

Tip: Prior to ESX Server 2.1, many systems were configured with large numbers of physical network adapters (often with the use of RXE-100 option) to securely separate network traffic between virtual machines. With the VLAN capabilities of ESX Server 2.1, this is no longer required. A typical server running ESX Server 2.1 or greater would have a total of three network adapters, one dedicated for the console OS and two adapters in a bond (adapter team) for virtual machine traffic, separated by VLANs if required. Adding additional adapters would not overcome the maximum throughput limitation.


For a further description of the IBM TotalStorage FAStT models available, and their relevant details, please refer to 1.1, “FAStT features and models” on page 4.

9.5.3 FAStT tuning considerations

RAID array types

There is a popular misconception that simply adding up the amount of storage required for the servers that will be attached to a SAN is good enough to size the SAN. The importance of understanding performance as well as capacity requirements has been covered in 2.3.5, “Planning your storage structure and performance” on page 27, but it is even more relevant to the VMware environment, as the concept of server consolidation is also thrown into the equation. Figure 9-8 and Figure 9-9 look at consolidating four physical servers into a single VMware ESX Server environment to explain the considerations.

Figure 9-8 Unrealistic storage consolidation

In Figure 9-8, an attempt is made to take the capacity requirement that is calculated from the four existing servers and use that as a guide to size a single RAID-5 array for the purpose of hosting all four virtual environments.

It is extremely unlikely that assigning a single RAID-5 LUN to the ESX Server host in this way would supply enough disk performance to service the virtual machines adequately.

Note: While the following guidelines will help in increasing the performance of a VMware ESX Server environment, it is important to realize that the overhead of the VMware ESX virtualization layer will still exist. In cases where 100% of the native or non-virtualized performance is required, then an evaluation as to the practicality of a VMware environment should be undertaken.


While an assessment of the performance of the individual environments may show that there is room for consolidation with smaller applications, the larger applications (mail or database, for instance) will require disk configurations in the SAN environment similar to those they had in the previous physical environment.

Figure 9-9 illustrates that a certain amount of storage consolidation may indeed be possible, as mentioned, without ignoring the normal disk planning and configuration rules that apply for performance reasons. Servers with a small disk I/O requirement may be candidates for consolidation onto a fewer number of LUNs, however, servers that have I/O intensive applications will require disk configurations similar to those of their physical counterparts. It may not be possible to make precise decisions as to how to best configure the RAID array types and which virtual machine disks should be hosted on them until after the implementation. In a FAStT environment, it is quite safe to configure some of these options later through the advanced dynamic functions that are available on the FAStT storage servers.

Figure 9-9 Potentially realistic storage consolidation

These changes might include adding more disks (capacity) to an array using the Dynamic Capacity Expansion function and joining two VMFS volumes together in a volume set, or changing the array type from RAID5 to RAID10 using the Dynamic RAID-Level Migration function, or changing the segment sizing to better match our application using the Dynamic Segment Sizing function. Information about all of these dynamic functions and more, and their applications, can be found in 4.2.1, “Modification operations” on page 80.

Note: Not all FAStT dynamic functions are supported in a VMware environment. For example, dynamic volume expansion increases the size of a LUN that is presented to the ESX Server, which may cause problems when addressing that LUN, as its parameters will have changed. Check with the FAStT Storage interoperability matrix at:

http://www.storage.ibm.com/disk/fastt/supserver.htm


Logical drives
Carving a FAStT array into multiple logical drives is optional, but again, consideration needs to be given to the performance impact on the hosted virtual machines, which will all read from and write to the same set of physical disks.

Both of these points are reflected in the logical disk structure of ESX Server as shown in Figure 9-7 on page 150.

Segment sizing and cache settings
While the VMFS file system has a minimum formatted file block size of 1 MB and a minimum I/O size of 512 bytes, the tuning of the segment size for each logical volume does not depend on these figures. For tuning of the LUN segment size (or cache settings, or other settings), consider using the recommended settings for the OS and application that will run in the hosted virtual machine. For more detail regarding FAStT segment size tuning according to application, refer to the IBM Redbook Tuning IBM eServer xSeries Servers for Performance, SG24-5287. You can find a softcopy at:

http://www.redbooks.ibm.com/pubs/pdfs/redbooks/sg245287.pdf

Other settings
Other specific FAStT configuration settings, including step-by-step instructions, will be covered in Chapter 12, “Installing VMware ESX Server” on page 197.

9.6 IBM eServer BladeCenter and ESX Server
Both VMware and BladeCenter allow for large-scale consolidation of server environments and have sometimes been seen as competing technologies. In reality, however, the BladeCenter complements a FAStT and VMware solution by further reducing the physical space required for the servers, while also improving the manageability and availability of the physical server environment.

VMware is IBM ServerProven® (http://www.pc.ibm.com/us/compat/nos/vmwaree.html) for both the HS20 dual Xeon processor blade and the HS40 quad Xeon MP processor blade, giving complete flexibility, from a small-scale VMware implementation on a single HS20 through to an enterprise-level implementation on many HS40 blades. In addition, blades that are needed as dedicated two- or four-processor servers can coexist with blades running VMware in the same BladeCenter chassis, reducing the number of different physical server models and extending the BladeCenter benefits to all servers, whether virtualized or not.

Some unique configuration is needed in some scenarios when running VMware on blades. This is covered in the section “BladeCenter specifics” on page 196 and in Chapter 12, “Installing VMware ESX Server” on page 197.

The following sections take a look at the IBM BladeCenter and its relation to FAStT storage.

Note: VMware recommends, as a best practice for ESX Server 2.1, not partitioning a single logical drive (LUN) into multiple VMFS volumes. This avoids any possibility of conflicting SCSI reservations to a FAStT LUN impacting I/O to other virtual machines housed on the same LUN.


9.6.1 Introduction to the IBM eServer BladeCenter
Blade servers are a relatively new technology that has captured industry focus because of its high density, high power, and modular design, which can reduce cost. This cost reduction comes from more efficient use of valuable floor space, reduced network and power infrastructure requirements, and simplified management.

The IBM eServer BladeCenter is a 7U modular chassis that takes advantage of advancements in server technology. It is capable of housing up to 14 functionally separate blade servers. The BladeCenter chassis allows individual blade servers to share resources such as power, switch, management and cooling modules.

The BladeCenter design combines:

� IBM Enterprise X-Architecture™ technology:

IBM Enterprise X-Architecture technology leverages proven innovative IBM technologies to build powerful, capable, reliable Intel-processor-based servers.

Enterprise X-Architecture technology includes features such as Predictive Failure Analysis® (PFA) and Advanced System Management.

� Expansion capabilities:

Blades can be added to the BladeCenter unit as needed, up to a maximum of 14 blades.

IBM blade servers have connectors for options that can be used to add capabilities to the blade, such as an I/O expansion card to add a network interface, or a storage expansion unit to add SCSI hard disk drives. Note that companies such as Cisco, Brocade, QLogic, and others have built products specifically for the IBM BladeCenter.

� Hot-swap capabilities:

The front bays on the BladeCenter unit are hot-swap blade bays; the rear bays on the BladeCenter unit are hot-swap module bays. You can add, remove, or replace blades or management, switch, power, or blower modules in hot-swap bays without removing power from the BladeCenter unit.

� Redundancy capabilities:

The redundant components in the rear of the BladeCenter unit enable continued operation if one of the components fails. Normally, the redundant power modules and blowers share the load. If one of the power modules or blowers fails, the non-failing power module or blower handles the entire load. You can then replace the failed blower or power module without shutting down the BladeCenter unit.

� Redundant network connection capabilities:

Configuring a pair of Ethernet switch modules in switch-module bays 1 and 2 identically provides support for Ethernet failover configured on blade servers. If blade server I/O expansion options can be configured for failover, configuring a pair of switch modules in switch-module bays 3 and 4 identically provides support for the failover configured on I/O expansion options.

Other I/O expansion options, such as the IBM HS20 Fibre Channel (FC) Expansion Card, provide a similar capability for redundant SAN connections.

� System-management capabilities:

The BladeCenter unit comes with a system-management processor in the management module. This system-management processor, in conjunction with the system-management firmware that is provided with the BladeCenter unit and the system-management processor in each blade server, enables you to remotely manage the BladeCenter unit, its components, and the blade servers.


The management module also multiplexes the keyboard, mouse, and video ports and the USB port across the multiple blade servers.

The system-management processor in each blade server provides blade server system monitoring, event recording, and alert capability.

� Network environment support:

The BladeCenter unit supports a minimum of one four-port 1 Gb Ethernet switch module, expandable to two Ethernet switch modules. Each switch module provides one internal connection to each blade server, up to 14 internal connections per switch module.

The BladeCenter unit also supports two additional switch modules, for a total of four switch modules. The two additional switch modules support the network interface on the optional I/O expansion card installed on one or more blade servers in the BladeCenter unit.

Each of these two additional switch modules provides one internal connection to the optional I/O expansion card, up to 14 internal connections per switch module.

These BladeCenter features alone can reduce the cost of deployment, reprovisioning, updating, and troubleshooting. The cost savings come from the fact that modern computing environments are often made up of hundreds of servers. With that many systems, even simple infrastructure, such as network cabling, can become very expensive. Blade-based computing reduces the amount of infrastructure required to support large numbers of servers. By integrating resources and sharing key components, costs are reduced and RAS (reliability, availability, and serviceability) is increased.

Reliability, availability, and serviceability (RAS) are three of the most important features in server design, and RAS features are found at unprecedented levels in the BladeCenter. These factors help to ensure the integrity of the data stored on the blades, that the blades are available when needed, and that, should a failure occur, it can be diagnosed and repaired with minimal inconvenience.

The following list describes some of the RAS features that the BladeCenter unit supports:

� Shared key components, such as power, cooling, and I/O

� All components serviced from the front or rear of the chassis

� Automatic error retry and recovery

� Automatic restart after a power failure

� Built-in monitoring for blower, power, temperature, voltage, and for module redundancy

� Remote system management through the management module

� Remote management module firmware upgrade

� Remote upgrade of blade server system-management processor microcode

� Predictive Failure Analysis (PFA) alerts

� Redundant components:

– Cooling fans (blowers) with speed-sensing capability
– Power modules

Note: The two additional switch modules must have the same I/O type, such as Fibre Channel, and must match the network interface on the optional I/O expansion cards in the blade servers.


� Hot-swap components:

– Cooling fans (blowers) with speed-sensing capability
– Power modules
– Management module
– Switch modules
– Blades
– Media tray

� System automatic inventory at startup

� System error logging with error codes and messages

9.6.2 BladeCenter disk storage
The direct-attach storage capacity in blade-based computing solutions is limited by the very small form factor of the blades themselves, a drawback that could limit the applicability of blade-based computing. Fortunately, BladeCenter provides an alternative: BladeCenter blades can easily attach to a Fibre Channel SAN via the FC I/O expansion card and FC switches. This ability is critical for implementing solutions that require access to shared disks, such as VMware VMotion.

The disk storage options available for BladeCenter blades are described next.

HS20 blades
� 2 x 2.5" IDE disks: These disks can be mirrored in firmware for resilience.

� 2 x U320 SCSI disks: These disks use the blade SCSI sidecar, which has two standard IBM xSeries hot-swap SCSI drive bays and an LSI U320 SCSI controller capable of implementing hardware RAID levels 0 or 1.

� 1 x Dual-Port Fibre Channel I/O expansion card: This card (in conjunction with the BladeCenter FC switch or OPM) allows a blade to connect to a SAN. The card takes the space required by the second IDE disk, reducing the number of possible IDE disks to one.

Any mix and match of these disk configurations can be implemented for complete flexibility, providing that the IDE and FC I/O expansion card implementation requirements are followed.

HS40 blades
� 2 x 2.5" IDE disks: These disks can be mirrored in firmware for resilience.

� 2 x U320 SCSI disks: These disks use the blade SCSI sidecar, which has two standard IBM xSeries hot-swap SCSI drive bays and an LSI U320 SCSI controller capable of implementing hardware RAID levels 0 or 1.

� 2 x Dual-Port Fibre Channel I/O expansion card: This card (in conjunction with the BladeCenter FC switch or OPM) allows a blade to connect to a SAN. The cards take the space required by each IDE disk, so that if two cards are used, no IDE disks can be installed.

Any mix and match of these disk configurations can be implemented for complete flexibility, providing that the IDE and FC I/O expansion card implementation requirements are followed.

IBM BladeCenter Fibre Channel switch module
When the FC I/O expansion card is used, this switch module(s) is inserted into the rear of the BladeCenter chassis to enable the blades to connect to a SAN. One or two modules can be installed depending upon the need for redundancy in the fabric. See Figure 9-10 for details.


Figure 9-10 FC switch (illustrated) or OPM installation in a BladeCenter chassis

The IBM BladeCenter FC switch module includes the following features:

� Ports:

– Two external ports to connect to storage devices or Storage Area Networks
– Fourteen internal ports to connect to blade servers

� Protocols: Support various Fibre Channel protocols

� Scalability: Maximum 239 switches depending on configuration

� Maximum User Ports: 475,000 ports depending on configuration

� Media Type: Small Form-factor Pluggable (SFP) hot-pluggable optical transceivers

� Fabric Port Speed: 1.0625 or 2.125 Gigabits/second

� Maximum Frame Size: 2148 bytes (2112 byte payload)

� Fabric Point-to-Point Bandwidth: 212 or 424 MBps full duplex

� Fabric Aggregate Bandwidth: 64 MBs for a single switch

� Fibre Channel Cable Media: 9 micron single mode, 50 micron multi-mode, or 62.5 micron multi-mode

Brocade SAN switch module for IBM BladeCenter
When the FC I/O expansion card is used, this switch module(s) is inserted into the rear of the BladeCenter chassis to enable the blades to connect to a SAN. One or two modules can be installed depending upon the need for redundancy in the fabric. See Figure 9-10 for details.

This recently announced FC SAN switch module for the BladeCenter enhances the value proposition of the BladeCenter particularly for environments where a Brocade fibre fabric already exists. Features of this switch include:

� Ports:

– Two external ports to connect to storage devices or Storage Area Networks
– Fourteen internal ports to connect to blade servers

� Fabric switch delivering Brocade functions, including performance, manageability, scalability, and security to support demanding Storage Area Networks


� The Brocade Entry SAN Switch Module is ideal for customers that will integrate BladeCenter into small Brocade SANs while still providing the ability to grow as needed

� The Brocade Enterprise SAN Switch Module delivers full SAN fabric for large Storage Area Networks

� Fully upgradeable with Brocade's suite of Advanced Fabric Services

� Compatible with existing Brocade fabric, Fabric OS features, and Brocade SAN management tools

IBM BladeCenter Optical Passthrough Module (OPM)
When the FC I/O expansion card is used, this passthrough module(s) is inserted into the rear of the BladeCenter chassis to enable the blades to connect to a SAN. One or two modules can be installed depending upon the need for redundancy in the fabric. Refer to Figure 9-10 on page 159 for details.

The OPM provides the ability to transmit and receive network data traffic for all 14 blade bays. Unlike the FC switches, this module simply breaks out the blade network or I/O card ports into fourteen cables. The networking environments supported by the OPM are as follows:

Fibre Channel
To enable Fibre Channel, the OPM can be inserted into switch module bays 3 and 4. In order for the OPM to function in bays 3 and 4, the Fibre Channel Expansion Card is required on the blade server. An external FC switch with fibre ports is needed to connect the OPM and blades to the network; one port per blade is required.

Gb Ethernet
To enable Ethernet, the OPM can be inserted into switch module bays 1, 2, 3, or 4. If the OPM is inserted into switch module bays 1 or 2, it interfaces with the integrated dual Gb Ethernet controllers on the blade server. In order for the OPM to function in bays 3 and 4, the Gb Ethernet Expansion Card is required on the blade server. An external network switch with fibre ports is needed to connect the OPM and blades to the network; one port per blade is required.


Connecting to FAStT
Using FC switches to connect to a FAStT gives the options presented in Figure 9-11 and Figure 9-12, depending upon whether there is an existing SAN fabric or not.

Figure 9-11 BladeCenter with FC switch modules direct to FAStT


Figure 9-12 BladeCenter with FC switch modules and external FC switches (fabric) to FAStT

Because of the dramatic reduction and change in cabling when connecting a BladeCenter, which may house up to fourteen blades, to a FAStT, special consideration is needed. For example, when using IBM FC switches, the fabric must operate in interoperability mode for the BladeCenter switches to participate in the SAN.

For more details on IBM BladeCenter SAN interoperability, go to:

http://www.ibm.com/servers/eserver/bladecenter/literature/solutions_lit.html

Look for the IBM eServer BladeCenter SAN Interoperability Guide and the IBM eServer BladeCenter SAN Solutions Guide.


Chapter 10. VMware ESX Server terminology, features, limitations, and tips

This chapter provides an overview of VMware terminology and features. It is not intended to replace the ESX Server 2.1 Administration Guide or ESX Server 2.1 Installation Guide, which we strongly advise you to review before attempting any installation (you can find these guides on the installation CD). It simply allows the reader to better follow and understand the steps outlined in the installation chapter. This chapter also contains some tips not documented in other publications.


10.1 Storage Management: naming conventions and features
The Storage Management feature can be launched by opening a Web browser and specifying the IP address of the ESX Server as the URL, then logging in and selecting the Options tab -> Storage Management.

There are three different selectable views: the Disks and LUNs view, the Failover Paths view, and the Adapter Bindings page.

10.1.1 Disks and LUNs
The Disks and LUNs view shows you all disks available for use with the virtual machines (either exclusively assigned or shared for use with the VMware kernel). You can use it to view disks and partitions and also to create or edit partitions on a disk.

Figure 10-1 shows an example of the Disks and LUNs view.

Figure 10-1 Disks and LUNs view

Note: If a disk adapter has been assigned to the Service Console only, the disks will not be visible in the Disks and LUNs view.

Also, IDE-based disks (such as those in blade servers) cannot be used by the VMware kernel services. They will therefore not be displayed in this view either.


In the example in Figure 10-1 you can see three disks, called vmhba0:14:0, vmhba1:0:0, and vmhba1:0:1.

All disks visible by the VMware kernel are named:

vmhba[x]:[y]:[z]

Where:

� [x] is the number of the disk controller (as detected by VMware, depending on PCI bus priority),

� [y] is the SCSI target ID (for LUNs on the SAN, the target ID is the ID of the owning controller)

� [z] is the actual LUN number.

So disk vmhba0:14:0 is the first LUN (LUN 0) on the first disk controller with SCSI ID 14. In this case it is actually the internal hot swap physical disk (SCSI 14 assigned by the backplane) of a x360 connected to the integrated SCSI controller.

As you can see from the partition structure, this disk has been used for the installation of VMware ESX (three primary partitions plus two partitions in the extended partition).

Disks vmhba1:0:0 and vmhba1:0:1 are two logical drives on the FAStT already configured with a VMFS partition each.
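As a purely illustrative aid (this is not an ESX Server tool, just standard shell parameter expansion run on the service console), the naming convention can be decoded as follows; the example name is assumed:

# Illustrative only: decode a vmhba[x]:[y]:[z] disk name into its parts
DISK=vmhba1:0:1              # example name as displayed in the MUI
ADAPTER=${DISK%%:*}          # vmhba1 -> the adapter (disk controller) ID
REST=${DISK#*:}
TARGET=${REST%%:*}           # 0      -> SCSI target ID (on a FAStT, the owning controller)
LUN=${REST#*:}               # 1      -> LUN number of the logical drive
echo "adapter=$ADAPTER target=$TARGET lun=$LUN"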

PCI slots sequence and HBA IDs
In most installations, the HBA through which a LUN is seen first is the HBA with the lowest ID. However, that is not necessarily always the case. Which HBA sees the LUN first depends on:

� The order in which HBA drivers are loaded (if you have different types of HBAs)

� the order in which HBAs are discovered by a driver, that is, the order of HBAs on the PCI bus

On systems where all HBAs are managed by the same driver (for instance all QLogic 23xx), the order depends only on the position of the HBA on the PCI bus.

Tip: ESX Server can dynamically detect added LUNs; you do not have to reboot the server. If, for instance, you forgot to connect your SAN, simply connect it, click Rescan SAN and then Refresh, and it will show up.

Note: Using the vmkfstools -s vmhba[x] command (where [x] is the number of the vmhba) you can also issue a rescan from the console OS command line. With ESX Server 2.1, issue this sequence of commands (this example is for scanning vmhba1):

1. wwpn.pl -s

2. vmkfstools -s vmhba1

3. cos-rescan.sh vmhba1
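If the rescan has to be repeated for several adapters, the three commands can be wrapped in a small convenience script. This is only a sketch built from the commands listed above; verify that the scripts exist under the same names on your ESX Server release before relying on it:

#!/bin/sh
# rescan-hba.sh -- wrap the documented ESX Server 2.1 console rescan sequence
# Usage: ./rescan-hba.sh vmhba1
HBA=$1
if [ -z "$HBA" ]; then
    echo "Usage: $0 vmhba<N>"
    exit 1
fi
wwpn.pl -s            # refresh the worldwide port name information
vmkfstools -s "$HBA"  # ask the VMware kernel to rescan the adapter
cos-rescan.sh "$HBA"  # rescan from the console OS side as well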

Note: In a multi-HBA configuration (that is, where the LUN can be seen through multiple HBAs), the ID of the HBA adapter which sees the LUN first (that is, the HBA on the path where the LUN is discovered first) will be used.


At installation time, HBAs are assigned IDs (for example, vmhba0) in increasing PCI slot numbers. So the HBA ID order and PCI slot order are the same, as shown in Example 10-1.

Example 10-1 Initial PCI slot and HBA sequence

# cat /etc/vmware/devnames.conf

001:06.0 nic     vmnic0
001:08.0 fc      vmhba0
001:08.1 fc      vmhba1
003:05.0 fc      vmhba2
004:08.0 nic     vmnic2
006:06.0 aic7xxx vmhba3
006:06.1 aic7xxx vmhba4

If an HBA adapter is later added in slot 003:04:0, VMware ESX Server will assign it the next ID in sequence, that is, “vmhba5”. At that point the HBA ID order and PCI slot order will differ, as illustrated in Example 10-2.

Example 10-2 Altered PCI slot and HBA sequence

# cat /etc/vmware/devnames.conf

001:06.0 nic     vmnic0
001:08.0 fc      vmhba0
001:08.1 fc      vmhba1
003:04:0 fc      vmhba5
003:05.0 fc      vmhba2
004:08.0 nic     vmnic2
006:06.0 aic7xxx vmhba3
006:06.1 aic7xxx vmhba4
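To see how the HBA IDs currently map to PCI addresses (and therefore which Fibre Channel HBA is likely to be the one through which a new LUN is seen first), the same file can be filtered and sorted with standard console OS tools; a small sketch:

# List only the Fibre Channel HBAs, ordered by PCI bus:slot.function
grep " fc " /etc/vmware/devnames.conf | sort
# The vmhba name shown against the lowest PCI address is normally the
# adapter that discovers (and therefore "sees") a new LUN first.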

10.1.2 Failover paths and failover policies
The Failover Path window shows you all available SAN paths for each LUN. It also shows the status of each path and allows you to modify it (set path to active, disable it, and so on).

VMware ESX Server 2.1 has its own SAN path failover mechanism built into the VMware kernel. When ESX Server boots, it scans all devices on the SAN. If it finds multiple paths to a logical drive, it will automatically enable failover for it.

As the FAStT has the capability to present both controllers as possible owners for each LUN (even though the LUN is owned by only one controller at a time, also known as the preferred controller), ESX Server enables path failover between controllers automatically.

Tip: This FAStT feature is correctly called the Inquiry Non-Owned Logical Unit Device Type and specifies what value the controller will return for a LUN it does not own. It is not the same as the Auto Logical Drive Transfer (ADT) feature, which activates a FAStT internal failover mechanism.

Note: Because ESX Server has a built-in SAN failover functionality, additional failover mechanisms like RDAC (or SDD for ESS storage) must not be used.


Figure 10-2 shows an example of the Failover Path window. It has been taken on a server with a single HBA, connected via fibre switch (single zone) to both controllers of a FAStT.

Figure 10-2 Failover Path view - single HBA

The Failover Path window shows the available path(s) for each SAN based LUN. In the above example we can see the two LUNs (named vmhba1:0:0 and vmhba1:0:1 as explained in 10.1.1, “Disks and LUNs” on page 164). The top line summarizes the LUN name, the number of paths available (2) and the default policy for the failover, in this case set to fixed (for demo purposes only).

ESX Server 2.1 has two different failover policies with the following behavior:

� fixed — when available, use the preferred path

� mru (most recently used) — continue to use the active path (no failback)

Tip: Even though it is sometimes incorrectly stated that ESX Server 2.1 default policy is mru, actually it does not always default to the same policy.

It is actually a little more complicated than that. ESX Server 2.1 defaults to:

� fixed — for active/active devices (ESS) and active/passive devices (FAStT) with ADT (Auto Logical Drive Transfer) enabled

� mru — for active /passive devices (FAStT) with ADT disabled

Please note that the FAStT is considered an active/passive device, as a LUN is owned by only one controller (either A or B) at a time, even if both controllers are online.
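The rule described in the tip above can be summarized in a few lines of shell-style pseudologic; this is only a sketch of the behavior described here, not ESX Server code:

# Sketch of how ESX Server 2.1 chooses its default failover policy
DEVICE_TYPE="active/passive"   # FAStT; an ESS would be "active/active"
ADT="disabled"                 # recommended FAStT setting for ESX Server 2.1

if [ "$DEVICE_TYPE" = "active/active" ] || [ "$ADT" = "enabled" ]; then
    POLICY="fixed"             # return to the preferred path when available
else
    POLICY="mru"               # stay on the most recently used path (no failback)
fi
echo "default failover policy: $POLICY"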


To illustrate the path thrashing effect sometimes experienced with ESX Server when ADT is enabled, let’s assume a configuration as shown in Figure 10-3.

LUN1 is owned by controller A of the FAStT. HBA1 of each server therefore has to be the active one (as each HBA1 is connected to switch1 which is connected to the host-side mini-hub 1, thus to controller A). LUN1 is seen and used by both servers (for instance for VMotion).

Now assume a cable problem on the path between HBA1 (of ESX1) and the switch. The following events would happen:

� Server ESX1 would not be able to access LUN1 via HBA1.

� ESX Server will use its internal failover mechanism and fail over to HBA2.

� Server ESX1 tries to access LUN1 via its new active HBA, HBA2.

� HBA2 is connected via switch2 to controller B. As ADT is enabled, LUN1 will move automatically to controller B when HBA2 tries to access it via controller B.

� Server ESX2 (running a version prior to 2.1), however, has no awareness of this. As the default policy is set to fixed, ESX2 will continue to access LUN1 via HBA1 and therefore via controller A.

� Every time ESX2 makes an I/O request to LUN1, the LUN moves back to controller A; every time ESX1 accesses LUN1, it moves to controller B again, resulting in the LUN thrashing between both controllers and unpredictable results.

Important:

When using ESX Server 2.1 with FAStT, it is required to disable ADT (see also 12.4, “FAStT NVSRAM settings” on page 206).

Disabling ADT automatically results in the default failover policy being mru, which prevents path thrashing (sometimes referred to as the ping-pong effect, as LUNs move back and forth between controllers). Note that path thrashing only happens in multi-initiator configurations (multiple servers accessing the LUN at the same time).

Please note that this configuration cannot be used with ESX Server 2.0.1. Starting with release 2.1, ESX Server has improved knowledge of the FAStT status codes. This allows ESX Server to take advantage of the disabled ADT mode to differentiate between the owning and the alternate controllers for a LUN. It is able to prevent path-thrashing situations by making all ESX nodes use the same controller as long as they all have a path to it (similar to, but not the same as, the RDAC functionality).


Figure 10-3 Sample configuration for path thrashing

Figure 10-2 on page 167 shows two paths to each LUN. As mentioned before, the screen was taken on a system with single HBA, connected via switch to both controllers. To determine what the two paths in our example represent, you need to understand the naming convention of the paths. The paths are actually identified by the name a LUN would have on this path, as explained in 10.1.1, “Disks and LUNs” on page 164.

Now look at the two paths for the first LUN. Based on the naming scheme vmhba[x]:[y]:[z], we see that only value [y] is different. As mentioned, for SAN drives, the target ID is the ID of the controller, so the two paths we see are actually the paths to the two FAStT controllers. If LUN 0 is owned by controller A, its name is vmhba1:0:0; if it is owned by controller B, it is vmhba1:1:0.

You can also see from the picture in Figure 10-2 on page 167 that the active path for each is indicated by a green square (only one path can be active at a time for a specific LUN). In this case LUN0 is owned by controller A while LUN2 is owned by controller B.

You can use this screen to change settings of the paths as shown in Figure 10-4.


Figure 10-4 Changing the path settings

Again, the fixed policy is only used for demonstration purposes. When using mru there is no Preferred path option available (as always, the last active path will be used if available).

Figure 10-2 on page 167 shows two paths to each LUN. How many paths you actually see on your system depends entirely on your configuration: the number of HBAs and the switch zoning determine the number of available paths.

Next, we show another path example in Figure 10-5.

Tip: We recommend always using FAStT Storage Manager to change ownership of LUNs between controllers (to achieve load balancing between FAStT controllers).

Thanks to improved communication between FAStT and ESX Server (from release 2.1), changing controller ownership in Storage Manager automatically adjusts the active path in VMware (seamlessly).

Also, attempting to change the active path in VMware (from the Failover Paths window) might not cause the LUN to move to the other controller. This is working as designed (specifically with ADT disabled).


Figure 10-5 Path example for dual path - crossover configuration

The example shown for illustration in Figure 10-5 is actually taken on a dual HBA system using the recommended cross-over cabling between switch and controller (4 cables) as described in 11.2, “Using redundant paths from the switches to the FAStT” on page 181.

Here you can see four paths to each LUN: each adapter sees a LUN via each of the two controllers (two paths), and with two HBAs installed, that gives a total of four paths to each LUN.

10.1.3 Adapter bindings
The Adapter Bindings view displays the World Wide Port Names bound to each HBA in the system. When using FAStT, it essentially tells you which adapter can see which controller of the FAStT (as the controllers are the target for the HBA, not the individual LUNs); see Figure 10-6 for an example.


Figure 10-6 Adapter Bindings window

10.1.4 VMFS or raw disks?
In some cases it might be useful (or required) to use raw disks instead of the VMFS file systems. When using a raw disk you basically assign the whole logical drive directly to the virtual machine.

Attention: ESX Server 2.1 uses persistent bindings; this means that bindings are retained after a rescan or reboot even if the devices are no longer present.

While this is particularly useful when using raw disks (as it can help to ensure mapping to the correct disk since raw disks do not use VMFS mapping), it can also be confusing and potentially cause problems when old (nonexisting) bindings are retained.

You can see this, for instance, after changing the path configuration (removing an adapter, changing your zoning, testing with different FAStT models etc.).

Tip: You can remove the persistent bindings using the /usr/sbin/pbind.pl script. However, just running the documented pbind.pl -D does not work, because during the shutdown process ESX Server saves the current bindings in /etc/pbindings, thereby recreating the file that pbind.pl -D just deleted.

A working procedure is to stop, set the VMware services not to start, reboot, delete the bindings, then restart the VMware service. Perform the following steps on the service console:

1. Log in and type: chkconfig vmware off.

2. Reboot the server (for example, init 6).

3. After reboot type /usr/sbin/pbind.pl -D.

4. Type chkconfig vmware on.

5. Type /etc/init.d/vmware start (or reboot).
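For reference, the same procedure is shown below as plain console OS commands; this is only a restatement of the steps above (the reboot in step 2 interrupts the sequence, so it cannot be run as a single script):

chkconfig vmware off        # 1. prevent the VMware services from starting
init 6                      # 2. reboot the server
# --- log back in to the service console after the reboot ---
/usr/sbin/pbind.pl -D       # 3. delete the persistent bindings
chkconfig vmware on         # 4. re-enable the VMware services
/etc/init.d/vmware start    # 5. start VMware (or reboot once more)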


Some (not all) of the advantages for each approach are shown in Table 10-1.

Table 10-1 Advantages and disadvantages: raw disks versus VMFS

VMFS
Advantages:
� Easier to manage - friendly names
� Better flexibility - ability to partition logical drives into customized partitions or use VMFS volume sets to merge them
� Advanced features - disk modes like undoable disks
Disadvantages:
� Some performance overhead
� Problem with ESX Server 2.1 preventing dual path configurations with split virtual clusters

Raw disk
Advantages:
� Potentially better performance
� Data can be accessed by ESX and non-ESX systems
Disadvantages:
� Lose all advantages listed for VMFS

We suggest that you use VMFS unless you have a specific reason not to do so.

10.1.5 Booting from SAN and use of NAS devices
With booting from SAN, we refer to the capability of an operating system to be installed on a SAN-based disk. The server therefore boots from the SAN disk, not an internal disk (also known as a diskless configuration or remote boot).

While there can be many reasons to do this (centralized backup, disaster recovery), there are also reasons why you might not want to (added complexity, no local swap file, and so on). At the time of writing, ESX Server 2.1 does not support this function.

With NAS devices we refer to disk resources based on Network Attached Storage (NAS) devices.

Restriction: ESX Server 2.1 (and previous versions) does not support booting from SAN. Please check with IBM support or VMware if you use a later version, as VMware plans to implement this feature in the future (no commitments are made for this, though).

Note: This restriction only applies to the ESX Server operating system, not to the virtual machines. Virtual machines will typically reside on the SAN and therefore boot off the SAN disk.

Restriction: With ESX Server 2.1 (and previous versions), you cannot use NAS storage either to boot from it or to store virtual machines on it. The reason for this is that the VMware kernel does not have NFS or SMB capabilities.

You can, however, use it to store images or other data accessed through the console OS.

10.1.6 Direct attached storage
With direct attached storage, we refer to configurations where the ESX Server is directly attached to the FAStT storage, without the use of a switch in between.

Restriction: At the time of writing, attaching an ESX Server (ESX Server 2.1 and previous versions) directly to the FAStT storage (all models, including FAStT600) is not supported. You have to use a switch between the server and the FAStT.

The root of this requirement lies in ESX Server not supporting arbitrated loop (AL) attachments. When attaching a server directly to a FAStT, a point-to-point connection is established; however, with point-to-point, several of the same protocols are sent as with AL.

While we have not seen any specific problems with direct attach during our testing, VMware does not currently support it and plans to perform additional testing in order to change this support statement in the future. Therefore, check with IBM support or VMware if you have a requirement for such an implementation.


10.1.7 FAStT premium features: RVM, FlashCopy, VolumeCopy
The premium FAStT features (FlashCopy, VolumeCopy, and Remote Volume Mirroring) are firmware-based functions, not operating system based. Therefore, the functions themselves work just the same, regardless of what type of hosts the FAStT is connected to.

There are, however, OS-specific considerations when it comes to the actual use of FlashCopy logical drives or remote mirrored drives with ESX Server.


Note: In the context of writing this book, we have not actually tested any of these functions (except for a very basic successful Proof of Concept (POC) of the FlashCopy function).

Any of these features are subject to specific certification by IBM (or alternatively RPQ), so please verify with IBM support if the configuration you plan to implement is supported. At the time of writing, there was only RPQ based support for any of these features with ESX Server 2.0.1.

Tip: Here are some things to bear in mind when implementing advanced functions with VMware ESX Server:

� When using VMFS, the target drive (of your FlashCopy, RVM, or VolumeCopy) is still VMFS. Only a server that can read VMFS will be able to use this data (for backup or other purposes). For instance, an in-house Windows 2000 backup server doing point-in-time backups of FlashCopy drives of all Windows NT, 2000, and 2003 servers will not be able to back up your .dsk files in the same manner.

� When using raw disks with ESX Server, you need to be careful with the numbering of the remote copied disks. As the disk is not identified by the friendly VMFS name but by vmhba number, you need to ensure that the vmhba is the same on both servers or modify the VMware configuration file accordingly to reflect changes.


10.1.8 Network considerations
As this is a storage-related document, we do not cover the networking setup in any great detail (except for BladeCenter in 12.9, “Configuring the virtual network” on page 250). This section only covers the basic approach and the new terminology introduced with ESX Server 2.1. Please review the ESX Server Administration Guide (on the installation CD) for more details.

Virtual switches
Virtual machines use virtual network adapters that you configure during VM creation. These adapters are presented to the guest OS as dedicated physical adapters. In reality, however, many virtual machines share only a few physical adapters.

ESX Server 2.1 introduces a new abstracted network device called the virtual ethernet switch. When installing VMware, you will be asked to create at least one virtual switch. You will be presented with the panel shown in Figure 10-7, which has a graphical representation of the virtual switch function.

Figure 10-7 Virtual Ethernet Switch principle

Important: We strongly advise that you carefully plan the network configuration and implementation, specifically if you intend to use some of the advanced networking features of ESX Server such as VLANs or upstream switch failure detection. Implementations of this kind are complex and depend on your external network infrastructure, therefore they require extensive design and testing.

While VMware ESX Server provides these advanced features, it is ultimately the responsibility of the implementor to ensure that these features can be implemented and will provide the expected functionality and protection.


A virtual ethernet switch functions very similarly to a normal physical switch: You have to install a switch before you can connect all your network devices together.

The virtual switch is used to connect the virtual machines (or better, the virtual network adapters you configured for the virtual machines) to actual physical adapters in your server. A virtual switch has 32 logical ports to connect up to 32 physical and virtual adapters.

The first step of the virtual network setup is the creation of a virtual switch and naming the switch with a Network Label.

The function of the switch is then determined by how many of your physical network adapters you assign (connect) to the switch (also referred to as outbound adapters):

� No physical adapters attached: The switch is for interconnection of virtual machines only, also referred to as vmnet.

� Single physical adapter attached: Virtual machines have their virtual adapters connected to the LAN segment of this adapter, also referred to as vmnic.

� Attaching multiple physical adapters to the same switch automatically creates a bond (or network adapter pair) for load balancing and failover functionality.

Note: As the virtual switch has 32 virtual ports, you can share a single physical adapter between a maximum of 31 virtual machines (configured with 1 virtual network adapter each).

Important: Develop a meaningful Network Labels naming convention, as you will use Network Labels when assigning network adapters to your virtual machines.

This is especially important when using VMotion, as the same network label must exist on all servers where you want to move virtual machines (and be connected to the same physical subnet) for VMotion to work.


Check the window shown in Figure 10-8 for available adapters.

Figure 10-8 Available physical (outbound) network adapters


Tip: A typical configuration of an ESX 2.1 Server could have three physical network adapters, one 10/100 or Gbit adapter dedicated to the console OS (strongly recommended) and two Gbit adapters in a bond to be shared between the virtual machines.

If you use VMware VirtualCenter’s VMotion function, you should use one additional Gbit adapter dedicated for the VMotion traffic.

The actual network configuration will depend on your configuration. You might need more network adapters, for instance, if you require more than 32 virtual network adapters.

BladeCenter specifics: As the HS20 blade has only two network ports, it is recommended (and supported) to configure both adapters in a bond shared by all traffic, achieving fault tolerance for all traffic. See 12.9, “Configuring the virtual network” on page 250 for more details.


VMware Port Groups: VLANs
VMware ESX Server 2.1 introduces the new function of Port Groups (commonly known as VLANs). VLANs are used to securely separate networks as if they were isolated in separate physical network segments. See Figure 10-9 for a graphical representation.

Figure 10-9 Port groups

You can find additional information at:

http://www.vmware.com/pdf/esx21_vlan.pdf

Attention: VLANs can add significant complexity to your ESX design and require careful planning and implementation testing. Only implement VLANs if they are really required. Ensure that the physical network infrastructure supports the required network functionality.


Chapter 11. VMware ESX Server storage configurations

There are many ways of implementing VMware ESX Servers that are attached to FAStT storage. Variants range from the number of HBAs/switches/paths that are available for an ESX Server, to multiple ESX Servers sharing access to logical drives on the FAStT.

This chapter describes some of the FAStT storage configurations that are possible and lists the configurations that were tested in Chapter 12, “Installing VMware ESX Server” on page 197.


11.1 Introduction
We have aimed to list the possible configurations according to two main principles: the requirement for non-redundant versus redundant pathing, and the requirement for non-shared versus shared access, including clustering. The configurations have therefore been broken down under the following headings:

1. Configurations using redundant paths from the switches to the FAStT

2. Configurations by implementation (redundant and non-redundant pathing):

– Single HBA with single switch– Multiple HBAs with multiple switches

These are simple configurations where advanced availability and management functions will not be implemented.

3. Configurations by function:

– Independent (non-shared) VMFS volumes

– Shared VMFS volumes

• High Availability (HA)• VMotion

– Clustering

• Local virtual machine cluster• Split virtual machine cluster• Physical/virtual machine (hybrid) cluster

These are more complex configurations where advanced availability and management functions will be implemented.

4. Zoning

5. BladeCenter specifics

The details about how the installation was configured and tested are referenced following the description and diagram given for each configuration. The configurations that were not tested were generally left out on the basis that they are extensions to a previous configuration. An example of this is where the configuration changes from a single ESX Server to multiple ESX Servers but each one remains attached to its own storage partition; therefore, the setup and test results would be identical in both cases.

Configuring according to a common base of settings allows for growth from one configuration to another with minimal impact. It is therefore recommended to review all the configurations with your growth plan in mind (as much as possible) so that best practices can be applied from the initial installation and will last through a final configuration as it develops over time.

This principle correlates with the installation and configuration details given in Chapter 12, “Installing VMware ESX Server” on page 197, whereby the settings that need to be made have been compiled into a common set for all configurations with additional minimal changes listed for specific configurations as required.

Note: All tested configurations utilize Fibre Channel switches. As previously mentioned, at the time of writing, VMware does not support direct attachment of an ESX Server to a FAStT system.


11.2 Using redundant paths from the switches to the FAStT
While testing the configurations described in 11.3, “Configurations by implementation” on page 186 and 11.4, “Configurations by function” on page 189, we encountered a limitation when discovering LUNs that have been newly assigned to an ESX Server.

The problem lies in the combination of how the ESX Server discovers newly available LUNs, and the fact that the FAStT shows that all LUNs exist behind both FAStT controllers. This FAStT function normally allows a host to determine that multiple paths exist for any given LUN; however, the ESX Server may not be able to discover a LUN if it is not mapped through the HBA through which a LUN is seen first (see “PCI slots sequence and HBA IDs” on page 165). The ESX Server may even crash during this discovery.

11.2.1 Recommended configuration
To avoid the discovery issue just mentioned, you must configure (wire) additional paths from the HBAs to the FAStT controllers (A and B), thus allowing every HBA to see every controller. Figure 11-1 shows a conceptual view of this setup (conceptual because, in practice, direct attachment without an FC switch is not supported).

Figure 11-1 LUN discovery with multiple paths to each controller

Note: VMware plans to change this behavior with the next major version of ESX Server. Check with VMware support if you are using a later version than ESX Server 2.1

Note: Given this current situation, VMware recommends that a configuration with multiple paths between the HBAs and the controllers be implemented.


Figure 11-2 and Figure 11-3 illustrate the recommended cabling with two ESX Servers and two FC switches connecting to a FAStT900.

The configuration in Figure 11-2 allows each HBA to see both controllers but requires the purchase of extra (host side) mini-hubs for a FAStT700 or FAStT900 (the standard configurations come equipped with two host-side mini-hubs only). The FAStT600, however, has four host ports as standard and does not use mini-hubs, so this configuration can be implemented easily in that case. All FAStT storage servers will also require enough switch ports for the four connections.

Figure 11-2 Multiple paths using extra mini-hubs

With this configuration, the use of four mini-hubs for host connections also has the implication of disallowing the Remote Volume Mirroring function, as this function requires two dedicated mini-hubs.

Note: In the case of a FAStT700 or FAStT900, connection using the four ports of the default two mini-hubs is not supported, as the port login would change to arbitrated loop (AL).


Although the configuration in Figure 11-3 was not tested for this redbook, it should allow each HBA to see both controllers through the implementation of an inter-switch link (ISL). This configuration can be implemented easily as long as enough switch ports exist, but it does introduce additional complexity in SAN fabric management.

Figure 11-3 Multiple paths using an inter-switch link (ISL)


Figure 11-4 The switch to FAStT configuration that was tested for the redbook

11.2.2 Workaround for LUN discovery
Again, we strongly recommend that a configuration as shown in Figure 11-2 on page 182 or Figure 11-3 on page 183 be implemented. If for some reason this is not feasible in your environment, or a configuration like the one shown next in Figure 11-5 already exists, use the possible workaround given in this section.

Note: Figure 11-4 shows the actual configuration that was used for our tests. It only uses one FC switch (with zoning). More information, including test results, can be found in Chapter 12, “Installing VMware ESX Server” on page 197.

Note: This workaround is included for completeness of information only; it is not meant as a substitute for the best practices presented under 11.2.1, “Recommended configuration” on page 181.


In Figure 11-5, when LUN1 is created and mapped in a FAStT storage partition, it will automatically be assigned to controller B of the FAStT (although this can be changed manually). The problem occurs when the Rescan SAN function (of ESX Server) is performed in the MUI and the LUN becomes visible via controller A. However, because it is unavailable via that controller (not the preferred controller), a time-out occurs.

Figure 11-5 LUN discovery in ESX Server

In order to work around the problem, follow this procedure:

1. Create a new logical drive (LUN).

2. Assign the logical drive to the controller that is attached to the HBA through which the LUN is seen first (HBA1 in Figure 11-5). Ensure that, on all ESX Servers, the HBA through which a LUN is seen first talks to the same FAStT controller (see “PCI slots sequence and HBA IDs” on page 165). If this is not the case, the other servers will not be able to see the new LUN(s).

3. Map LUNs by configuring Storage Partitioning.

4. Use the Rescan SAN function on all ESX Servers to discover the new LUN(s). If this is not done, the other servers will not be able to see the new LUN(s). The server will time-out during rescan if step 2 has not been done; in the worst case the server may even freeze completely.

5. Change controller ownership of the logical drive if needed.

The following additional procedures may also be necessary:

6. If time-outs occur on any of the ESX Servers, move all LUN ownerships back to the FAStT controller attached to the HBA through which a LUN is seen first on all ESX Servers (Controller A in Figure 11-5).

7. If a new additional ESX Server is added to the host group, ensure that all LUNs are placed on the path/controller of the HBA through which a LUN is seen first before LUN discovery on the new ESX Server. If this is not done, this new server will not be able to see all LUN(s).

Note: This problem exists with both single and multiple ESX Server configurations connected to a FAStT.


11.3 Configurations by implementation
This section shows examples of configurations that are possible when using single (non-redundant pathing) and multiple (redundant pathing) HBAs in the ESX Server(s) that will attach to their own FAStT storage partitions.

These are common installation options where the SAN architecture determines the configuration, that is, whether multiple HBAs or Fibre Channel switches are used.

11.3.1 Single HBA with single switch

We first examine single HBA with single switch configurations.

Single server with single HBA

The configuration in Figure 11-6 shows a very basic setup whereby a single ESX Server is connected to the FAStT using a single HBA.

Figure 11-6 Single server with single HBA configuration sample

The setup instructions for this configuration can be found in 12.3.1, “Single path” on page 201.



Multiple independent servers with single HBAs

The configuration in Figure 11-7 shows multiple independent ESX Servers connected to the same FAStT, assuming there is no sharing of LUNs between the servers. In other words, the LUNs for ESX Servers ESX1 and ESX2 are separated by mapping them to distinct storage partitions.

Figure 11-7 Multiple independent servers with single HBAs configuration sample

The setup instructions for this configuration can also be found in 12.3.1, “Single path” on page 201.

11.3.2 Multiple HBAs with multiple switches


Note: In our test we used only one switch and we implemented zoning to represent two switches.

For examples of dual switch configurations, see Figure 11-14 on page 195 and Figure 11-15 on page 195 in 11.5, “Zoning options” on page 194.


Single server with multiple HBAs

The configuration in Figure 11-8 shows a single ESX Server connected to the FAStT using dual HBAs for redundancy.

Figure 11-8 Single server with dual HBAs configuration sample

The setup instructions for this configuration can be found in 12.3.2, “Dual path: crossover” on page 202.

Multiple servers with multiple HBAs

The configuration in Figure 11-9 shows multiple independent ESX Servers connected to the same FAStT storage where there is no sharing of LUNs between the servers (this configuration could have more than just two hosts). Note that the allocated storage is not shared but is mapped to different storage partitions.



Figure 11-9 Multiple servers with multiple HBAs - configuration sample

The setup instructions for this configuration can also be found in section 12.3.2, “Dual path: crossover” on page 202.

11.4 Configurations by function

We discuss here the different configurations that are available when using multiple ESX Servers.

A VMFS volume can be set up as one of the following:

� A VMFS volume that is visible to only one ESX Server host. We call these independent VMFS volumes. When you have multiple ESX Servers, independent VMFS volumes can be set up through LUN masking (storage partitioning). This type of configuration is rarely needed and not recommended. It might be implemented when there is a requirement to keep the virtual machines of the different ESX Servers separate. This would be the case, for example, where two companies or departments share a SAN infrastructure but need to retain their own servers and applications.

� A VMFS volume that is visible to multiple ESX hosts. This is the default. This VMFS mode is called public VMFS.

� A VMFS volume that is visible to multiple ESX hosts and that stores virtual disk (.dsk) files for split virtual machine clustering. This VMFS mode is called shared VMFS.



� Public VMFS might be implemented for the following reasons:

– High Availability (HA) using two (or more) ESX Servers with shared LUN(s) allowing for one ESX Server to take over the workload of the other ESX Server if needed. With public VMFS, virtual machines can be run on either host server ensuring continuous application availability in case of a hardware failure on one of the ESX Servers, or if scheduled maintenance is required for one of the ESX Servers.

This remapping of virtual machines to a different ESX Server host can be achieved through several manual or automatic (scripted) methods. It could, for instance, entail either creating a new virtual machine configuration that points to the existing virtual machine .dsk file, or simply registering a replicated copy of the original virtual machine’s .vmx configuration file on the other ESX Server (a command-line sketch is given after this list).

This is possible, as multiple ESX servers have access to the same VMFS volumes and a virtual machine can be started from potentially any ESX Server host (although not at the same time). It is important to understand that this approach does not protect against .dsk file corruption or failures in the storage subsystem unless the .dsk file is in some form replicated elsewhere.

– VMotion is a specific extension to the HA configuration. Indeed, VMotion allows a running virtual machine to be migrated from one ESX Server to another without being taken offline. As above, in scenarios where an ESX Server needs to be taken down for maintenance, the virtual machines can be moved without being shut down and while they are receiving workload requests.

– Clustering is another specific extension to the HA configuration that increases the availability of the environment. Not only can workload be transferred with minimal interruption during maintenance, but near continuous application availability can be achieved in case of an OS crash or hardware failure depending upon which of the following configurations is implemented:

• Local virtual machine cluster increases availability of the OS and application. Many server failures relate to software failure; implementing this configuration can thus help reduce software downtime. This configuration does not however increase hardware availability, and this may need to be taken into account when designing the solution.

• Split virtual machine cluster increases availability of the OS, application, and ESX Server hardware by splitting the cluster nodes across two ESX Servers. In the event of OS or ESX Server hardware failure, the application can failover to the surviving ESX Server/virtual machine cluster node.

• Physical/virtual machine (hybrid) cluster increases availability of the OS, application, and server hardware where one node is a dedicated physical server (non-ESX), and the other node is a virtual machine. Implementations of this kind are likely to occur where the active node of the cluster requires the power of a dedicated physical server (that is, four or more processors, or more than 3.6 GB memory), but where the failover node can be of a lesser power, yet remains for availability purposes.

The physical/virtual machine (hybrid) cluster may also be implemented where there are a number of dedicated physical servers as active nodes of multiple clusters failing over to their passive cluster nodes that all exist as virtual machines on a single ESX Server. As it is unlikely that all active nodes will fail simultaneously, the ESX Server may only need to take up the workload of one cluster node at a time, thus reducing the expense of replicating multiple cluster nodes on dedicated physical servers.

However, the physical server (that is, not the ESX Server) can only have a non-redundant SAN connection (a single HBA and a single FAStT controller); therefore, we do not actively advocate the use of this solution.
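As a minimal sketch of the manual remapping mentioned for the HA case above, the vmware-cmd utility on the ESX service console can register an existing virtual machine configuration file so that the virtual machine appears in the MUI of the second ESX Server and can be powered on there. The path below is purely hypothetical and assumes that the .vmx file (and the .dsk file it references) is reachable from that server.

   # On the surviving ESX Server, register the virtual machine configuration
   vmware-cmd -s register /home/vmware/vm1/vm1.vmx

Remember that the virtual machine must not be running on more than one ESX Server at the same time.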


11.4.1 Independent VMFS volumes

The setup for these configurations will be the same as for the configurations in 11.3.1, “Single HBA with single switch” on page 186 and 11.3.2, “Multiple HBAs with multiple switches” on page 187.

11.4.2 Public VMFS volumes

The examples in the following sections show the configuration options available when multiple ESX Servers attach to shared storage partitions.

High Availability (HA)

The configuration in Figure 11-10 shows multiple ESX Servers connected to the same FAStT having a logical drive (LUN) shared between the servers (this configuration could have more than just two ESX Servers). Note that in contrast to the section “Multiple servers with multiple HBAs” on page 188, the allocated storage is shared so that both ESX Servers access the same storage partition.

Figure 11-10 Multiple servers sharing a storage partition configuration sample

The setup instructions for this configuration can be found in “Configurations with LUN sharing” on page 211.



VMotion

The configuration for VMotion will function the same as the configuration in the preceding High Availability (HA) section, so the setup instructions for this configuration can also be found in the section “Configurations with LUN sharing” on page 211.

11.4.3 Clustering

As discussed in the introduction to section 11.4, “Configurations by function” on page 189, there are a number of different ways to implement MSCS with VMware ESX Server, depending upon the high-availability requirements and whether physical servers are included in the mix.

The following sections look at the different ways that MSCS might be implemented.

Local virtual machine cluster

Figure 11-11 Local virtual machine cluster

In the configuration in Figure 11-11, VMFS volumes are used with the access mode set to public for all the virtual machine disks.

Note: Clustering is currently only supported by VMware using Microsoft Clustering Services (MSCS) on Windows guests, and only in a two-node per cluster configuration.



Split virtual machine cluster

Figure 11-12 Split virtual machine cluster

In the configuration in Figure 11-12, VMFS volumes are used with the access mode set to public for all virtual machine .dsk files (OS boot disks), and raw volumes are used for the cluster shares. The cluster shares could be .dsk files on shared VMFS volumes, but limitations make the use of raw volumes easier to implement.

Physical/virtual machine (hybrid) cluster

In this configuration, one or more physical machine cluster nodes are clustered with one or more virtual machine cluster nodes.

Note: In this configuration, you cannot use VMFS partitions, because the physical server (for instance, one running Windows 2003 with Microsoft Cluster Service (MSCS)) cannot read VMFS partitions; it does not have a file system driver for this VMware file system.

Therefore, you must use raw disks.

Restriction: The physical server can only have one HBA, no multipath driver, and there can only be one controller in the FAStT. Therefore, we do not recommend this configuration.
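When raw LUNs are used for the cluster shares, as in the split and hybrid cluster configurations above, a virtual machine node references the LUN directly by its vmhba device name in its configuration file. The following lines are an illustrative sketch only: the SCSI bus and target numbers and the device name vmhba1:0:1:0 are hypothetical and depend entirely on your configuration, additional cluster-related options (such as SCSI bus sharing) are also required, and the exact configuration keys should be verified against the ESX Server 2.1 documentation.

   # Hypothetical .vmx entries giving a virtual machine node direct access
   # to a raw LUN (adapter 1, target 0, LUN 1, whole disk) as a cluster disk
   scsi1.present = "TRUE"
   scsi1:0.present = "TRUE"
   scsi1:0.name = "vmhba1:0:1:0"
   scsi1:0.mode = "persistent"

The physical cluster node, in turn, simply sees the same LUN as an ordinary disk through its own (single) HBA.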



11.5 Zoning options

Zoning for an ESX environment is essentially no different from zoning for a non-ESX environment. It is considered good practice to separate the traffic for stability and management reasons.

Zoning will follow your standard practice where, in reality, it is likely that multiple servers with different architectures (and potentially different cable configurations) will be attached to the same FAStT. In this case additional hosts would be added to the appropriate existing zone(s), or separate zones would be created for each host.

Three examples are given in Figure 11-13, Figure 11-14, and Figure 11-15 to demonstrate that using a single or multiple switch configuration gives different options. But, essentially, the question is whether to put all ESX Servers into one zone, or to isolate the ESX Servers into their own zones.

Single switch with multiple zones

Figure 11-13 Single switch with multiple zones



� Multiple switches with single zones:

Figure 11-14 Multiple switches with single zones

� Multiple switches with multiple zones:

Figure 11-15 Multiple switches with multiple zones



11.6 BladeCenter specifics

Generally speaking, the configuration of a BladeCenter will follow the same design principles as a non-BladeCenter environment. However, there are differences that require attention, such as the fabric implications mentioned in “IBM eServer BladeCenter and ESX Server” on page 155.

The following section deals with attaching a BladeCenter to a FAStT storage server.

11.6.1 The configuration tested in this redbook

The configuration in Figure 11-16 shows a BladeCenter environment where multiple independent ESX Servers are connected to the FAStT storage and there is no sharing of LUNs between the servers (this configuration could have more than just two hosts and LUNs). The main purpose of this diagram is to show how the BladeCenter FC switches were used in our testing.


Figure 11-16 BladeCenter configuration sample

The setup instructions for this configuration can be found in 9.6, “IBM eServer BladeCenter and ESX Server” on page 155.



Chapter 12. Installing VMware ESX Server

This chapter describes the steps required to configure the hardware and install VMware ESX Server on machines connected to a FAStT Storage Server.

Please follow all steps systematically to ensure that all required settings for hardware and software are applied correctly.


12.1 Assumptions and requirements

The instructions in this chapter assume that you have done the following tasks:

� Reviewed Chapter 11., “VMware ESX Server storage configurations” on page 179 and decided which cabling and zoning configuration you want to implement.

� Physically installed all servers and storage.

� Installed the latest supported version of Storage Manager and can communicate with the FAStT.

� Updated your server including all installed options, FAStT, and fibre switch to the latest supported BIOS and firmware levels (including the Management Module (MM), Ethernet Switch, and FC Switch Module (FCSM) if you are using a BladeCenter).

� Obtained all required software, including ESX Server 2.1 and all required license keys.

� Set up a management system to remotely manage the ESX server (with a supported version of a Web browser, for example, Internet Explorer 5.5 or higher).

The settings used in this chapter represent a common group of settings that work with all configurations tested and described in this book. Some settings might not be required for certain configurations. Nevertheless, we strongly advise that you implement all the settings described regardless of the specific configuration you implement. Doing so will ensure your environment is prepared for the potential future addition of systems with different configurations.

Tip:

� The FAStT Storage interoperability matrix can be found at the following URL:

http://www.storage.ibm.com/disk/fastt/supserver.htm

� The xSeries and BladeCenter compatibility lists can be found at:

http://www.pc.ibm.com/us/compat/

� The latest firmware for xSeries servers can be found by following the server link on the Driver Matrix site:

http://www.ibm.com/pc/support/site.wss/document.do?lndocid=DRVR-MATRIX

Note: All testing has been done with the GA version of ESX Server 2.1 (build 7728). The described settings are valid for ESX Server 2.1 and 2.1.1 only. Please verify the instructions with IBM support or VMware if you use a later version. (The settings described in this document will not work with ESX Server 2.01.)

At the time of writing, the release of ESX Server 2.1.1 was imminent. ESX Server 2.1.1 is only considered a maintenance release of 2.1, and all the instructions in this document are therefore completely applicable to 2.1.1.

Important: While we have tried to keep the installation instructions as generic as possible, there are some specifics when installing blade servers compared to other xSeries servers.

We will point out the differences in boxes marked BladeCenter Specifics.


12.2 HBA configuration

You begin the installation procedure by configuring the HBA(s) for use with VMware ESX Server.

You need to perform these actions on every server before installing VMware.

Ensure that each adapter on the server has been flashed with the latest BIOS (refer to the BIOS readme.txt for details).

After updating the Qlogic code, perform the following steps to change the HBA settings to the optimal VMware settings:

� Boot your server.

� On initialization of the Qlogic BIOS, press Ctrl+Q to enter the Fast!UTIL setup program.

� Select the first adapter (this screen is presented only when multiple adapters are present), as shown in Figure 12-1.

Figure 12-1 Host adapter selection in the Fast!UTIL setup program

� Select Configuration Settings → Restore Default Settings, then press Enter (Figure 12-2).

Figure 12-2 Restoring the default settings

� Select Configuration Settings → Host Adapter Settings and write down the Adapter Port Name (or WWN) (Figure 12-3).

Tip: Record the Adapter Port Name or WWN; you will need it when setting up storage partitioning in Storage Manager.


Figure 12-3 Displaying the Adapter Port Name

� Select Configuration Settings → Advanced Adapter Settings.

� Verify that the following parameters are set correctly:

Enable LIP Reset: No

Enable LIP Full Login: Yes

Enable Target Reset: Yes

� Repeat all the above steps for the second adapter (if installed).

� Save your changes.

� Do not reboot your server at this time, remain in the Fast!UTIL setup program.

Your HBAs are now correctly configured for use with VMware. Proceed with 12.3, “Fibre switch configuration” on page 200.

12.3 Fibre switch configuration

The switch configuration depends upon the level of redundancy you have chosen for your configuration.

The following sections cover three main switch configurations:

� If your ESX server has only a single HBA installed, please continue with 12.3.1, “Single path” on page 201

� If your server has a dual HBA configuration, you should implement redundant crossover cabling (4 cables) between your switch(es) and the FAStT as discussed in Section 11.2, “Using redundant paths from the switches to the FAStT” on page 181.

Follow the instructions in 12.3.2, “Dual path: crossover” on page 202 for this configuration.

Configuring the switch will differ slightly depending on what switch (such as external Brocade versus internal Qlogic BladeCenter switch) and relevant firmware you use; the basic approach, however, will remain the same.


Tip: Refer to Chapter 11., “VMware ESX Server storage configurations” on page 179 for help in deciding which configuration is the appropriate one for your needs.


Before you proceed, verify that your switches are running the latest supported version of their firmware.

12.3.1 Single path

For single path configurations, a single switch is typically used. Zone your switch so that the HBA has access to both FAStT controllers (A and B). When attaching a single ESX server, you can simply create one zone for all ports.

Our examples in Figure 12-4 and Figure 12-5 show the zoning of a Brocade switch for the attachment of a single ESX server with single path configuration to the FAStT. Ports 0 and 1 are used for the FAStT controllers, while port 10 connects to the HBA of the server.

Figure 12-4 Switch example - single path

Figure 12-5 shows an example of a valid switch zoning for this configuration. All three switch ports are members of the same zone.

Figure 12-5 Zoning example - single path

BladeCenter specifics:

If you configure an IBM BladeCenter with the integrated Qlogic FCSM, please make sure you review 12.3.3, “Zoning the Qlogic FCSM of a BladeCenter” on page 203.

If your BladeCenter is configured with the Brocade Enterprise SAN Switch Module (BESSM), the management interface will be similar to the one for the external Brocade switches.

Restriction: If you are connecting the BladeCenter FCSM (Qlogic switch module) to an additional external SAN Switch (for instance Brocade) you will have to run the switches in interoperability mode, meaning that you cannot do port level zoning.


Please ensure that your configuration is enabled before you quit the switch configuration application. The status of your configuration must show up as enabled in the top right hand corner of the screen, as visible in Figure 12-5.

If your configuration consists of multiple independent servers, refer to “Considerations for attaching multiple independent hosts” on page 211, otherwise continue with 12.4, “FAStT NVSRAM settings” on page 206.

12.3.2 Dual path: crossover

We recommend using a dual switch configuration for redundancy. For simplicity in our tests, we used a single (zoned) switch as illustrated before, in Figure 11-8 on page 188.

In this case, you can see that ports 0 and 1 are used for cabling a first path to the FAStT controllers (A+B), ports 4 and 5 are used for the second path to the FAStT (again to A+B), while ports 10 and 14 are used to connect the HBAs of the server (Figure 12-6).

Figure 12-6 Switch example - dual path crossover

Figure 12-7 shows an example of a valid switch zoning for this configuration. Each HBA can see both controllers through a different pair of cabling, HBA1 through cable pair 0/1 and HBA2 through cable pair 4/5.

Figure 12-7 Zoning example - dual path crossover

If you use dual switches, then you would simply create one zone on each switch (for example, Zone_A would be created on switch 1 and Zone_B would be created on switch 2).

Please ensure that your configuration is enabled before you quit the switch configuration application. In the example in Figure 12-7, it must show up in the top right hand corner as enabled.
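If you prefer the Brocade command line (telnet) over the Web-based switch configuration application, the zoning shown in Figure 12-7 could also be created and enabled with commands along the following lines. This is a sketch only: the zone and configuration names are hypothetical, the switch domain ID is assumed to be 1, and the port numbers are those used in our example cabling.

   zonecreate "ESX1_HBA1", "1,0; 1,1; 1,10"
   zonecreate "ESX1_HBA2", "1,4; 1,5; 1,14"
   cfgcreate "ESX_cfg", "ESX1_HBA1; ESX1_HBA2"
   cfgsave
   cfgenable "ESX_cfg"

The cfgenable command corresponds to the enabled status mentioned above; without it, the zoning changes are saved but are not in effect.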


If your configuration consists of multiple independent servers, review 12.3.4, “Considerations for attaching multiple hosts” on page 206; otherwise proceed with 12.4, “FAStT NVSRAM settings” on page 206.

12.3.3 Zoning the Qlogic FCSM of a BladeCenter

Even though the general approach outlined in 12.3.1, “Single path” on page 201 and 12.3.2, “Dual path: crossover” on page 202 is applicable to any configuration (and switch model), we’ll give a sample zoning configuration for the integrated Qlogic Fibre Channel Switch Module (FCSM).

In this example we have configured the BladeCenter with two blade servers (each configured with one IDE drive and a dual port fibre daughter card) using two FCSMs in the BladeCenter chassis.

� Open the IBM BladeCenter SAN utility

� Select Open existing fabric, and you will be presented with a screen as shown in Figure 12-8.

You need to add both FCSMs.

Figure 12-8 Adding a fabric

� Enter the IP address of the first switch, specify Login Name and Password (defaults: Admin/password) and click Add Fabric.

� Add the second switch by selecting Fabric → Add Fabric.

You should now see a screen similar to Figure 12-9.

Attention: You need to download and install the IBM BladeCenter SAN Utility to manage the FCSM.

Note: Ports 0 and 15 are used for the external switch connections. Port 0 is actually the bottom connector on the switch module.


Figure 12-9 BladeCenter SAN Utility - Main view

Now configure your zones as required; remember that you need to configure each switch individually. For a valid dual path sample configuration (with cross-over connections to the FAStT) see Figure 12-10 and Figure 12-11.

� Select the first switch and click Zoning → Edit Zoning, and set up your zones on FCSM1 as required; see Figure 12-10 for a sample zoning of FCSM1.

Figure 12-10 Zoning sample of FCSM1


� Repeat the zoning-related steps for the second FCSM; see Figure 12-11 for an example.

Figure 12-11 Zoning sample of FCSM2

Figure 12-12 Activate zoning

Attention: To make the zoning effective, you will have to activate the zoning on each of the two switches by selecting Activate Zone Set... as shown in Figure 12-12.


12.3.4 Considerations for attaching multiple hosts

In reality, you are likely to attach multiple hosts (potentially with different cable configurations) to the same FAStT. In this case you can either join each additional host in the appropriate existing zone or, if you are required to separate traffic for each host, you could create a separate zone for each host. See Section 11.5, “Zoning options” on page 194 for further details and examples.

12.4 FAStT NVSRAM settings

This section describes what NVSRAM settings are required for use with VMware ESX.

NVSRAM settings are used on the FAStT to set certain default settings for the storage behavior (for example, how the storage is exactly presented to the host, which features are enabled, etc.). These settings typically differ depending on the type of operating system running on the host. In order to have OS specific options, the settings for a particular server are determined by the host type specified during the storage partitioning setup; see 12.5, “LUN configuration and storage partitioning” on page 207.

The following settings are required when connecting ESX 2.1 systems to FAStT storage:

� Disabled ADT (see 10.1.2, “Failover paths and failover policies” on page 166 for details).

� Enabled Propagated Host Bus Resets (used mainly for multi-node configurations).

� Enabled Allow Reservation On Unowned LUNs.

The host type representing the above settings correctly is LNXCL, not Linux.

Attention: At the time of writing (Storage Manager v 8.40/8.41), a VMware specific host type was not available. With the current FAStT architecture, only 16 possible host types are available and all of them are already used. In order to add a new host type, an existing host type definition would have to be removed. Enhancement requests to increase the number of possible host types have been made, but no time frame for a change has been committed.

Important: We strongly recommend that you use host type LNXCL when connecting ESX 2.1 hosts (see Figure 12-13 as an example).

� In the past, host type Linux was typically used, but ESX 2.1 now requires settings that were not required with 2.01 (such as ADT disabled); ESX 2.1 Server communication with FAStT has been enhanced to retrieve and use these FAStT settings.

� In theory, you can use host type Linux, but you would have to manually modify the default NVSRAM settings using FAStT NVSRAM scripts. This would also mean that every time you reload the NVSRAM you will have to reapply these scripts.

� As the only differences between LNXCL and Linux are the above three settings, we strongly recommend the use of LNXCL for all ESX implementations using ESX 2.1 (regardless of whether for standalone, VMotion, or clustering).


Figure 12-13 Use LNXCL as host type

12.5 LUN configuration and storage partitioning

Refer to 3.3.2, “Creating arrays and logical drives” on page 72 for general guidance on how to define arrays, create LUNs, and define storage partitions.

In summary:

� Open the Storage Manager Client application.

� Connect to your FAStT storage.

� Use Logical/Physical View to configure the required arrays and logical drives.

� Use the default settings unless the application you plan to run in the virtual machine requires any specific settings.

� Make sure LUNs are distributed across controllers A and B to achieve proper load balancing.

Figure 12-14 is an example of a LUN configuration with two logical drives as we defined it for our tests. One drive is assigned to controller A and the other to controller B to achieve load balancing across controllers. For practical sizing guidelines or limitations, refer to Chapter 9., “Introduction to VMware” on page 141.

Note: The Storage Manager client cannot be installed on the ESX Server; instead, install it on a Linux or Windows management workstation. This can be the same management workstation used for the browser-based VMware Management Interface.


Figure 12-14 LUN creation with Storage Manager

After you have created all required logical drives, proceed with storage partitioning.

How you set up your storage partitions depends on the functionality you plan to implement:

� If your configuration does not require LUN sharing (independent ESX servers, local virtual cluster) go to 12.5.1, “Configurations without LUN sharing” on page 209.

� If your configuration requires LUN sharing (VMotion, VMFS sharing or split virtual cluster) continue with the instructions 12.5.2, “Configurations with LUN sharing” on page 211.

BladeCenter specifics:

If you are installing ESX Server on a blade server with IDE drives, you will not be able to install the swap file locally, as IDE drives cannot host VMFS file systems. You have to set up the swap file (as well as the vmdump file) on a VMFS partition on the SAN (you could also use the optional SCSI drives for the swap file, but that would cut your server density in half — we will therefore use the SAN for it).

This is officially supported by IBM and VMware.

We therefore recommend creating a separate logical drive for the swap file of each blade server. The size should be at least equal to the amount of physical memory + 100MB (for the VMware core dump), and the drive should be mapped in Storage Manager to this server only. See Figure 12-15 for an example of storage partitioning for such a configuration.
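As a worked example (the memory size is purely illustrative): a blade with 4 GB of physical memory would need a swap logical drive of at least 4 GB + 100 MB, that is, roughly 4.1 GB.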

Tip: A typical configuration can also contain a combination of shared and un-shared logical drives. For example, you might want to use several logical drives for VMotion (shared) but still have other dedicated logical drives with data specific to a host (non-shared). In this case, use a combination of the two approaches.

See Figure 12-15 as an example where the logical drives used for the ESX Server swap file (BladeCenter IDE configuration) are assigned to the individual hosts (non-shared) and the logical drives intended for VMotion are assigned to the host group (therefore shared).


Figure 12-15 Example of combined shared and unshared logical drives

12.5.1 Configurations without LUN sharing

For these configurations, no logical drives will be shared between ESX servers. Therefore each logical drive must be mapped either directly to a host or to a host group with a single host as a member.

� Select the Mappings View in the Storage Manager client application.

� Create the required host group(s) and host(s) with the relevant host port(s) (HBA).

Tip: If you cannot seem to get the Host port identifier (the WWN of your HBA) to show up in the Define Host Port window, try the following technique:

� Reboot your server and enter the Qlogic BIOS by pressing Ctrl+Q on initialization (unless you are still in the Qlogic BIOS setup screen as instructed in 12.2, “HBA configuration” on page 199)

� Select Scan Fibre Devices from within the Qlogic BIOS as shown in Figure 12-16.

Figure 12-16 Scan Fibre Devices

If the host port identifier still doesn’t show, you can also enter it manually; otherwise verify switch zoning and cabling.


� Select LNXCL as host port type for every ESX host port as shown in Figure 12-17.

Figure 12-17 Select host type LNXCL for the ESX host ports

� Map each drive to its respective host.

The example in Figure 12-18 shows a host group with a single host as a member and a host port created for each physical HBA in the server (in this case, two). Two logical drives are mapped to this host group. Mapping it to the host directly would have the same effect.

Figure 12-18 Storage partitioning - single server / dual path attachment without LUN sharing

Attention: At the time of writing, there was no VMware specific host port type available. It is therefore important that you select LNXCL as host port type.

Important: Do not map an access LUN to any of the ESX Server hosts or host groups.


If your configuration consists of multiple independent servers, review “Considerations for attaching multiple independent hosts” on page 211. Otherwise, you have finished the preparation of your hardware and you can proceed with 12.7, “ESX Server installation” on page 215.

Considerations for attaching multiple independent hosts

The basic storage partitioning approach remains the same, since no logical drives will be shared between ESX servers. Therefore each logical drive must be mapped either directly to a host or to a host group with a single host as a member.

Figure 12-19 shows an example when attaching multiple (two) independent hosts with dual path configurations without LUN sharing.

Figure 12-19 Multiple independent servers / dual path attachment without LUN sharing

In this example we mapped one logical drive directly to each host.

You have now finished the preparation of your hardware and you can continue with 12.7, “ESX Server installation” on page 215.

12.5.2 Configurations with LUN sharing

For these configurations, logical drives will be shared between ESX servers. Therefore each shared logical drive must be mapped to a host group containing all hosts needing to access the logical drive(s):

� Select the Mappings View in the Storage Manager Client application.

� Create the required host group(s) and host(s) with the relevant host port(s) (HBA).


Figure 12-20 Scan Fibre Devices

� Select LNXCL as host port type for every ESX host port as shown in Figure 12-17 on page 210.

Figure 12-21 Select host type LNXCL for the ESX host ports

� Map each drive to its respective host group (see Figure 12-22).

Tip: If you cannot get the Host port identifier (the WWN of your HBA) to show up in the Define Host Port window, try the following technique:

� Reboot your server and enter the Qlogic BIOS by pressing Ctrl+Q on initialization (unless you are still in the Qlogic BIOS setup screen as instructed at the end of 12.2, “HBA configuration” on page 199).

� Select Scan Fibre Devices from the Fast!UTIL options menu (see Figure 12-20).

If the host port identifier still doesn’t show up, you can also enter it manually, otherwise verify switch zoning and cabling.

Attention: At the time of writing, there was no VMware specific host port type available; it is therefore important that you select LNXCL as host port type (see Figure 12-21).

Important: Do not map an access LUN to any of the ESX hosts or host groups.


Figure 12-22 Storage partitioning - two servers with dual path attachment and LUN sharing

This example shows a host group with two hosts as members and host ports created for each physical HBA in the server (in this case two). Two logical drives are mapped to this host group.

Note: Mapping the logical drive to either of the hosts directly would disable the ability to share the logical drive; it has to be mapped to the host group instead.

You have finished the preparation of your hardware and you can continue with 12.7, “ESX Server installation” on page 215, which describes the common installation steps.

12.6 Verifying the storage setup

The following section will help you verify that your storage setup is fundamentally correct and that you can see the FAStT storage:

� Boot your server.

� On initialization of the Qlogic BIOS, press Ctrl + Q to enter the Fast!UTIL setup program.

� Select the first adapter (only an option when multiple adapters are installed).



� Select Host Adapter Settings and press Enter (Figure 12-23).

Figure 12-23 Host adapter selection in the Fast!UTIL setup program

� Select Scan Fibre Devices and press Enter (Figure 12-24).

Figure 12-24 Scan Fibre Devices

When running the Scan Fibre Device routine, an entry for the FAStT Controller should be displayed, similar to what is shown in Figure 12-25.

Figure 12-25 Result of the Scan Fibre Device showing the FAStT controller

Depending on the cabling scheme implemented, you might see multiple instances.

If you cannot see any FAStT controller, you will need to verify the cabling, switch zoning, and FAStT LUN mapping (a quick switch-side check is sketched after this procedure).

� Exit the setup program.
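When troubleshooting visibility problems, the Fibre Channel switch itself is often the quickest place to look. On a Brocade switch, for example, the following commands (run from a telnet session) show whether the HBA and FAStT ports have logged in to the fabric and registered with the name server; the exact output varies with the firmware level, and other switch vendors provide equivalent functions.

   switchshow     # lists each port, its state, and the WWN logged in on it
   nsshow         # lists the name server entries (WWNs of HBAs and FAStT controllers)

If the expected WWNs are missing here, the problem lies in the cabling or port state rather than in zoning or LUN mapping.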


12.7 ESX Server installation

In this document, we focus on the local installation process for the ESX 2.1 Server software on servers connected to FAStT storage. There are other ways to install ESX in a more automated fashion, either using the remote and scripted capabilities of ESX Server itself or using deployment tools like the IBM Remote Deployment Manager (RDM) (at the time of writing, not officially supported yet for use with ESX).

With ESX Server 2.1, a graphical installation interface, referred to as the Graphical Installer, is available for the first time. You can still use the text mode or Text Installer if you want or need to. However, the Graphical Installer requires fewer reboots (only one), and we therefore recommend using the Graphical Installer if you can.

ESX Server will automatically attempt to start the Graphical Installer mode and will switch to text mode if it cannot:

� If you install ESX Server on HS20/HS40 blades connected to a chassis that does not have a USB mouse, go to 12.7.1, “Text mode installation of ESX Server (BladeCenter without USB mouse)” on page 215.

� If you install ESX Server on any supported xSeries server (such as x445, x365) or on a HS20/HS40 blade server connected to a BladeCenter chassis with external USB mouse, go to Section 12.7.2, “Installation of ESX using the Graphical Installer” on page 230.

12.7.1 Text mode installation of ESX Server (BladeCenter without USB mouse)

You can install ESX Server by either selecting direct keyboard, video and mouse (KVM) for the appropriate blade, or by using remote control through the BladeCenter Management Module (MM). In our case we used remote control via the MM.

Restriction: When installing ESX Server 2.1 on a BladeCenter, you cannot use the Graphical Installer unless you use an additional USB mouse connected to the USB port on the front of the BladeCenter chassis. This is because ESX Server 2.1 cannot detect the native mouse connected to the BladeCenter chassis. To summarize:

� BladeCenter with standard mouse: Text mode installation only.

� BladeCenter with additional USB mouse: Graphical Installer (recommended) or text mode.

Due to the above restrictions, we recommend to order a USB mouse if you plan several local installations on blade servers.

Important:

� In the writing of this book, all testing has been done with HS20 blades only.

� The following installation procedures assume that you have completed all steps listed in the previous sections of this book.

Note: With ESX Server 2.1 you do not have to disconnect the server from the SAN before starting the installation any more (as with previous versions). The setup will ignore SAN attached disks during the installation.


� Insert the ESX Server 2.1 CD in the CDROM drive of the BladeCenter chassis.

� Ensure that the CDROM is owned by the correct blade.

� Power on the blade.

The VMware Welcome screen is displayed as shown in Figure 12-26.

Figure 12-26 VMware Welcome screen

� Simply press Enter when using an HS20 blade (you can also type text and the setup will go straight into text mode — in our case we just pressed Enter).

� When the screen in Figure 12-27 is displayed, select Use text mode by using the Tab key, and press Enter.

Figure 12-27 No USB mouse present

� The next screen (Figure 12-28) displays some general information and directs you to the VMware Web site for the latest documentation.

Important: ESX Server 2.1 does not support IDE drives in RAID mode. If you require RAID support for your system drives, you have to use the SCSI expansion option instead.

Important: When using the HS40 blade, you have to type text at the boot: prompt, otherwise the server will hang.


Figure 12-28 ESX Server support site

� Click OK.

Figure 12-29 Custom install selection

� Select Custom as Installation Type as shown in Figure 12-29 — you will be able to configure your keyboard model for the correct country set-up.

Figure 12-30 Keyboard selection

� Select your keyboard model for the correct country set-up.

� Click OK.

� The screen in Figure 12-31 will be presented.


Figure 12-31 Mouse selection

� On the screen in Figure 12-31, select None, as you do not have any recognizable mouse connected.

� Select OK.

� You are now presented with the end user license agreement screen as shown in Figure 12-32.

Figure 12-32 End user license agreement


� Read through the end user license agreement and check Accept End User License to accept the terms in the license agreement, then tab to the OK box and press Enter.

� Enter the ESX serial number for your installation as shown in Figure 12-33. If you have purchased the ESX Virtual SMP option, then also enter the serial number for it.

Figure 12-33 ESX server and SMP Serial numbers

� Select OK.

� The Disk Partition Setup screen as shown in Figure 12-34 appears.

Figure 12-34 Disk Partitioning setup

� You can either select Manual or Automatic:

– Automatic will set up the main partitions automatically and you can still modify them to your specific needs (recommended if you are not familiar with Linux type disk setups).

– Manual requires you to set up all partitions from scratch

� Ensure that Remove all is selected (we assume a new system)

� Click OK.

� Accept the warning and continue with the removal of the partition.


� The screen in Figure 12-35 shows the partitions created after selecting Automatic on a blade with an IDE drive.

Figure 12-35 Hard Disk Partitioning

� Make any custom changes that your installation requires. Use the arrow keys to select the partition you wish to change, and the Tab key to select Edit to make your changes. Alternatively, to accept the default values for Partitioning, tab through OK and press Enter.

� The next screen (Figure 12-36) is the Network Configuration screen.

Important: Figure 12-35 shows only the default partitions, not the recommended values for a production setup. Please review your specific requirements beforehand and create a partition structure according to your needs.

Tip: It is good practice to add a /var partition of about 1GB, as log files (which are typically placed in /var) can quickly grow and fill up the system partition. A full system partition can lead to unpredictable behavior!

We also suggest adding a VMimages partition (of at least 4 GB), which can be used for storing virtual machine images, mounting CD images, and so on.

If you are using SCSI drives to install ESX on a blade, also see Example 12-1 on page 237 for a partition sample.
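As a purely illustrative sketch of how the local IDE drive of a blade might be partitioned when following these tips (the sizes are assumptions, not values from our test setup, and should be adjusted to your requirements and drive capacity):

   /boot        50 MB
   /            2500 MB
   swap         512 MB   (console OS swap, not the ESX Server swap file)
   /var         1024 MB
   /vmimages    4096 MB or more

Because an IDE drive cannot host a VMFS file system, any remaining space can be left unallocated or added to /vmimages; the ESX Server swap file itself is placed on a SAN logical drive, as discussed under the BladeCenter specifics in 12.5 and in “Configuring the ESX Server swap file” on page 242.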



Figure 12-36 Network Configuration

� It is recommended to use a fixed IP address for any server. Press the space bar to unselect DHCP and tab to the Hostname field. Type in your server name and fill in the remaining IP configuration information. Tab through to OK and press Enter.

� The Time Zone selection screen appears; select your time zone.

Figure 12-37 Root Password

� The Root Password screen appears; specify a root password. Make sure not to use a trivial password that would compromise security.

� The Add User screen appears (Figure 12-38). Add at least one user.


Figure 12-38 Adding user IDs

� After entering the User ID, Password and Full Name as seen in Figure 12-38, tab to OK and press Enter.

On the next screen you can enter additional User IDs (Figure 12-39). You do this by selecting Add and pressing Enter. It is easier to add them later, however, using the Users and Groups feature from the VMware Management Interface.

Figure 12-39 List of User IDs, you can add additional users later

� Tab to OK and press Enter when you are ready to continue.

The next screen (Figure 12-40) is for information; a complete log of the installation will be saved at the location indicated.


Figure 12-40 Installation Log Location

� Tab to OK and press Enter.

The VMware ESX Server 2.1 installation program will now format the hard drive, create the partitions, and begin installing the software. You should see a screen similar to Figure 12-41, reflecting the installation progress.

Figure 12-41 Installation progress screen

The last screen (Figure 12-42) confirms that the installation of ESX Server is complete.

Figure 12-42 ESX Server 2.1 server installation complete

� Your installation is now complete; press Enter.

Note: Eject the CD manually, as the ESX Server installer will fail to eject it automatically.


After the server has rebooted, you should see a screen similar to the one in Figure 12-43, notifying you that the ESX Server is now ready to be configured.

Figure 12-43 VMware ESX Server console screen

� From your management workstation open a Web browser session and type the ESX Server IP address as URL (you may have to disable the proxy in your browser settings), for example, http://9.43.231.23.

The first time you access the ESX Server via the Web browser, you are prompted to accept a Security certificate as shown in Figure 12-44.

Figure 12-44 Security certificate

� Accept the certificate by clicking Yes. The VMware Management Interface login dialog is displayed (Figure 12-45).

Attention: In text mode installation, the configuration of the Startup Profile (assignment of devices to the console OS, to the virtual machines, or shared between the two) is not done during the installation, unlike with the Graphical Installer. The installation is not complete until this configuration step has been performed.


Figure 12-45 Login to VMware Management Interface

� Enter the root ID and password, then click Log In.

Figure 12-46 Cancel the wizard

� After cancelling the wizard you are presented with the main Status Overview screen, shown in Figure 12-47.

Attention: On the first login (text mode only) you will be presented with a wizard that takes you through the configuration steps.

When using a BladeCenter, some additional steps are required which are not included in the wizard (specifically networking setup). Therefore, it is better not to use the wizard. Please cancel the wizard by clicking Cancel on the screen shown in Figure 12-46. Then, follow the steps provided.


Figure 12-47 VMware Management Interface incomplete installation swap space warning.

� Click the Options tab to get to the VMware ESX Server Options screen as seen in Figure 12-48.

Figure 12-48 VMware Management Interface - Options

Attention: Even though the screen in Figure 12-47 shows warnings (yellow triangles) indicating unconfigured swap space and network (virtual ethernet switches), we will configure those later on. First, we need to set up the Startup Profile.


� Click Startup Profile...

The screen that is displayed allows you to assign devices to the ESX Service Console, dedicate them to the virtual machines, or share them between the two (except network adapters). You get a screen similar to the one shown in Figure 12-49, which presents you with all of the PCI devices detected and available to be allocated for use with ESX Server.

Figure 12-49 Startup Profile

Browse the Startup Profile screen, and confirm that all your SCSI, Qlogic, and Ethernet adapters are shown:

� If you wish, check the box to enable Hyper-threading.

� Select the correct amount of memory for your service console, depending on the number of virtual machines you plan to run.

� The first Ethernet adapter detected will by default be dedicated to the Service Console. Do not change this; otherwise, you are likely to lose the management connection to your ESX Server.

Tip: We recommend selecting an amount higher than actually required, as it is not uncommon that virtual machines are added later on, requiring more memory.

Attention: We will discuss the actual network configuration of the blade server in 12.9.2, “Configuring the virtual network on HS20 blade servers” on page 252. For this initial configuration, leave the current settings unchanged.


� If you install ESX Server on blades with SCSI drives (using the SCSI expansion option), enable the SCSI controller for shared use between virtual machines and Service Console by selecting Virtual Machines and checking the Shared with Service Console box.

� Set all Qlogic adapters to be dedicated to the Virtual Machines; do not enable the Share with Service Console check box.

� Assign the remaining Ethernet adapter(s) to the virtual machines, unless you plan to implement multiple adapters for the console OS (not typical).

� After you finish configuring the Startup Profile options, click OK to continue.

A system reboot is required (Figure 12-50).

Figure 12-50 Reboot screen

� Click OK. The Standby screen is presented (Figure 12-51).

Attention: Enabling the SCSI controller for shared use enables you later on to place the ESX Server swap file on the internal disks (rather than the SAN), which is recommended.

Note: The IDE controller is not listed in Startup Profile. This is because IDE controllers cannot be used by vmkernel devices (for example, assigned for use with virtual machines).

That results in the IDE drives not being able to host any VMFS file systems. With IDE based systems, therefore, you cannot place the ESX swap file on the local drives; you will have to create it later on the SAN drives.


Figure 12-51 Standby screen

� Let the server boot with the default boot option in the LILO screen as shown in Figure 12-52.

Figure 12-52 LILO Boot Menu

� After the reboot, you are presented with the screen in Figure 12-53 (displayed on the ESX server, not the management system).

Figure 12-53 Welcome screen

Note: You might see error messages on the ESX Server console related to the swap file; you can safely ignore them at this stage (as we have not yet configured the swap space).


� On the management station, you don’t need to click OK: just wait until the system has rebooted, then accept the security warning by clicking Yes, and the login will appear.

Your server is now ready to be configured remotely.

Attention: The installation of ESX Server is now complete.

Next, proceed with “Configuring the ESX Server swap file” on page 242.

12.7.2 Installation of ESX using the Graphical Installer

This section guides you through the installation using the Graphical Installer.

� Insert the ESX CD and power on the server.

� The VMware welcome screen is displayed as shown in Figure 12-54.

Figure 12-54 Welcome screen

� Press Enter to start the ESX Server installation.

� After completion of the hardware inspection, the graphical Welcome to VMware ESX Server screen displays, as shown in Figure 12-55.


Note: As described in 12.7, “ESX Server installation” on page 215, you can use these instructions for a blade server installation only if you have an additional USB mouse attached to the BladeCenter chassis.

BladeCenter Specifics: Ensure that the CD-ROM is owned by the correct blade.


Figure 12-55 ESX 2.1 GUI installation welcome screen

� Click Next to continue, and you will get the installation screen (Figure 12-56).

Figure 12-56 Custom installation

� Choose Custom as shown in Figure 12-56 if you want to change keyboard or mouse settings, and click Next.


Figure 12-57 Keyboard setup

� Review and make adjustments to your keyboard configuration if required (Figure 12-57), and click Next.

Figure 12-58 Mouse setup

� Review and make adjustments to your mouse configuration if required (Figure 12-58), and click Next.


Figure 12-59 License acceptance

� Read the license agreement (Figure 12-59), then check the acceptance box and click Next to continue.

Figure 12-60 ESX Server Serial number entry

� Enter the ESX Server serial number for your installation as shown in Figure 12-60. If you have purchased the VMware Virtual SMP option, then also enter the serial number for it. Click Next.


The next screen allows you to assign each device either to the ESX Server Service Console, to the virtual machines for exclusive use, or to be shared between the two (network adapters cannot be shared). You get a screen similar to the one in Figure 12-61, displaying all the detected PCI devices available for use with ESX Server.

Figure 12-61 Device Allocation sample

� Select the correct amount of memory for your Service Console, depending on the number of virtual machines you plan to run.

The first detected ethernet adapter will by default be dedicated to the Service Console. You cannot change this.

� Enable the SCSI controller used for the ESX Server boot drives (typically the onboard SCSI controller or ServeRAID™ adapter) for shared use between virtual machines and Service Console by selecting Virtual Machines and checking the Shared with Service Console box

Note: The screen in Figure 12-61 is only an example; your actual screen could show completely different devices depending on your actual hardware configuration (for instance in production systems you would normally have at least two network adapters installed).

Tip: We recommend that you select more memory than you currently require, as more memory often becomes necessary when additional virtual machines are added.


� Set all Qlogic adapters to be dedicated to the Virtual Machines; do not enable the Share with Service Console check box.

� Assign the remaining Ethernet adapter(s) to the virtual machines, unless you plan to implement multiple adapters for the console OS (not typical).

� After completing the Startup Profile configuration, click Next to continue.

You are presented with the partitioning screen shown in Figure 12-62.

Figure 12-62 Hard Disk partitioning

� You can either select Manual or Automatic:

– Automatic will set up the main partitions automatically. You can modify them to your specific needs (recommended if you are not too familiar with Linux type disk setups)

– Manual requires you to set up all partitions from scratch

� Ensure that Remove all partitions is selected (we assume a new system); click Next.

Attention: Enabling the SCSI controller for shared use enables you later on to place the ESX server swap file on the internal disks (rather than the SAN), which is recommended.

BladeCenter Specifics: The IDE controller is not listed in Startup Profile. This is because IDE controllers cannot be used by vmkernel devices (for example, assigned for use with virtual machines).

As a result, the IDE drives cannot host any VMFS file systems. With IDE-based systems, therefore, you cannot place the ESX Server swap file on the local drives; you will have to create it later on the SAN drives.

BladeCenter Specifics: We will discuss the actual network configuration of the blade server in 12.9.2, “Configuring the virtual network on HS20 blade servers” on page 252. For this initial configuration, leave the settings unchanged.


Figure 12-63 Confirm remove all partitions

� Click Yes to accept the warning (Figure 12-63) and continue with the removal of the partition.

Figure 12-64 Default Partition Allocation

The screen in Figure 12-64 shows the partitions created after selecting Automatic Partitioning on a system with internal SCSI drives.


Important: Figure 12-35 on page 220 shows only the default partitions, not the recommended values for a production setup. Please review your specific requirements and create a partition structure according to your needs.

Tip: The following example shows a suitable partition setup that could be used.

Example 12-1 Partition sample setup

In this example we have used the Automatic Partitioning option during the setup, as shown in Figure 12-64 on page 236, and then modified the partitions.

We added a /var partition of 1 GB, which is good practice considering that log files (which are typically placed in /var) can quickly grow and fill up the system partition. A filled-up system partition can lead to unpredictable results!

We also added a VMimages partition that can be used for storing images, mounting CD images, and so on.

To create space, we had to remove and recreate the existing VMFS partition.

� Make any custom changes that your installation requires, then click Next to display the Network Configuration screen (Figure 12-65).


Figure 12-65 Network Configuration

� Enter the ESX server name and IP information in Figure 12-65, then click Next.

Figure 12-66 Time Zone configuration

� On the screen in Figure 12-66, select the correct time zone for your installation, then click Next. You should get the Account Configuration screen (Figure 12-67).


Figure 12-67 Account Configuration

� Enter the Root Password and create at least one additional User ID to manage the Virtual Machines (Figure 12-67). Click Next to continue.

Figure 12-68 Installation is about to start

� Acknowledge the screen in Figure 12-68 by clicking Next; the installation process begins.

Attention: The password entered for Root should not be a trivial password.


After the setup program has copied all required files, you should see the completion screen in Figure 12-69.

Figure 12-69 Installation complete

� The VMware installation is complete. Click Next to reboot.

During the reboot, VMware displays the screen shown in Figure 12-70; the default boot option is ESX.

Figure 12-70 LILO Boot Menu

After the system has booted, you see the screen shown in Figure 12-71. Your server is now ready to be remotely configured.

BladeCenter specifics:

You need to eject the CD manually, as ESX server will not automatically do it.


Figure 12-71 VMware welcome screen

Attention: The installation is now complete.

Proceed to 12.8, “Configuring the ESX Server swap file” on page 242.


12.8 Configuring the ESX Server swap file

The ESX Server swap file is specifically used for advanced memory functions when you allow your virtual machines to use more memory in total than is physically available on the server. The swap file has to reside on a VMFS file system. It is always recommended to create the swap file locally (if possible).
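For example, if a server with 3 GB of physical memory runs virtual machines that are configured with a total of 4 GB of memory, the additional 1 GB is backed by the ESX Server swap file.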

� On your management workstation, open a Web browser session and type the ESX server IP address as the URL (you may have to disable the proxy in your browser's settings), for example, http://9.43.231.23.

Figure 12-72 Security certificate

� When the Security Alert pops up (Figure 12-72), accept the certificate by clicking Yes.

BladeCenter specifics:

If you are installing ESX on a blade server with IDE drives, you will not be able to create the swap file locally, as IDE drives cannot host VMFS file systems. You have to set up the swap file as well as the VMware Core Dump on a VMFS partition on the SAN. (The VMware Core Dump is the partition where ESX dumps core memory in case of a server crash, to allow debugging.)

This is officially supported by IBM and VMware.

We therefore recommend creating a separate logical drive for the swap file of each blade server. Its size should be at least equal to the amount of physical memory plus 100 MB (for the VMware Core Dump), and the drive should be mapped in Storage Manager to this server only. Review Figure 12-15 on page 209 for an example of storage partitioning for such a configuration.
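For example, a blade server with 4 GB (4096 MB) of physical memory would need a logical drive of at least 4196 MB for its swap file and Core Dump.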


� Enter the root ID and password as shown in Figure 12-73 and click Log In.

Figure 12-73 Login to Vmware Management Interface

You will then be presented with the Status Monitor screen as shown in Figure 12-74.

Figure 12-74 VMware Management Interface - swap space warning

� Observe the yellow warning triangles shown in the top left hand corner in Figure 12-74.

The top one warns you that no swap space has been configured yet.


� If you are installing ESX Server on a blade server with IDE drives, go to 12.8.1, “Swap file: required steps for blade servers with IDE drives” on page 244.

� For all other servers, continue with the instructions in 12.8.2, “Creating and activating the swap file” on page 247.

12.8.1 Swap file: required steps for blade servers with IDE drives

Instructions given in this section only apply when installing ESX Server on a blade server with IDE drives.

To create a VMFS partition for the ESX Server swap file:

� Click the Options tab and you will see the screen shown in Figure 12-75.

Figure 12-75 Options screen

BladeCenter specifics:

As mentioned, the ESX swapfile requires free space on a VMFS file system.

If you are installing ESX on a blade with IDE drives, no VMFS partition has been automatically created during the installation. You will therefore have to create a VMFS partition manually before you can configure the swap file.


� Click Storage Management; the Storage Management screen is displayed as in Figure 12-76.

Figure 12-76 Available LUNs

� Verify that you can see all the LUNs that were assigned to this ESX Server; if not, check your zoning and FAStT Storage Manager Partitions.

� Identify the disk you created for the swap file of this server.

� Click Create Volume.

Figure 12-77 VMFS Volume creation

Tip: ESX Server can dynamically detect added LUNs; you do not have to reboot the server. If you had not connected the server to the SAN, simply connect it now and click Rescan SAN and Refresh to display the LUNs.

Note: Using the vmkfstools -s vmhba[x] command (where [x] is the number of the vmhba), you can also issue a rescan from the console OS command line. Due to a slight bug in ESX 2.1, however, you need to issue this sequence of commands (example for scanning vmhba1):

1. wwpn.pl -s

2. vmkfstools -s vmhba1

3. cos-rescan.sh vmhba1
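If your server has more than one HBA, you would typically repeat steps 2 and 3 for the other adapter as well (for example, vmhba2); the actual vmhba numbering depends on your configuration.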


You are now presented with the choice for Typical or Custom as shown in Figure 12-77:

Typical    Creates a single VMFS volume using all the available space.
Custom     Allows you to create multiple VMFS Volumes of varying sizes.

� Select Typical.

Figure 12-78 Creation of Core Dump Partition

� The wizard will detect that no Core Dump Partition is present as shown in Figure 12-78 and ask you if it should create one; confirm by clicking Yes.

Figure 12-79 VMFS Logical drive name

� Enter a unique name for the VMFS volume, such as SWAP_ESX1 (in our example, we simply called it VMFS_Logical_Drive_1, as shown in Figure 12-79). Then click OK.

Tip: For shared VMFS configurations, use Typical, as you can only have one VMFS partition per LUN.


Figure 12-80 Core Dump and VMFS created

� Review that the VMFS Volume and the VMware Core dump have been created (Figure 12-80). Click Close to close the window.

You have now successfully set up your VMFS partition and can continue with the creation of the swap file as described in 12.8.2, “Creating and activating the swap file” on page 247.

12.8.2 Creating and activating the swap fileTo create and activate the swap file:

� Click Reconfigure... associated with the swap space warning as shown in Figure 12-81 (you can also select the Options tab and click Swap Configuration... as shown in Figure 12-75 on page 244).

Figure 12-81 Reconfigure... swap space


You will be presented with a screen as shown in Figure 12-82 indicating that no swap file is activated or configured.

Figure 12-82 Swap Configuration

� Click Create.

Figure 12-83 Creating the swap file

� Clicking on the VMFS Volume pull-down tab will allow you to choose on which VMFS (Figure 12-83) you want to create the swap file. Ensure that the correct one is selected.

� Adjust the swap file name if required (not typical).

� Accept the recommended file size.

� Ensure that the Activation Policy is set to Activate at system startup.

� Click OK.


Figure 12-84 Swap configured but not activated

Now that the Swap file has been created, you need to activate it:

� Click Activate as shown in Figure 12-84.

� Verify that the Swap file now shows as activated; click OK.

� Select the Status Monitor tab.

The warning referring to the unconfigured swap file has disappeared, as can be seen in Figure 12-85.

Figure 12-85 VMware Management Interface

You have successfully created and activated your ESX Server swap file; continue with 12.9, “Configuring the virtual network” on page 250.


12.9 Configuring the virtual network

This section describes the setup of the virtual network. It only covers the basic concepts, as specific configuration details completely depend on your network infrastructure and requirements.

12.9.1 Configuring the network for all systems except HS20 blade servers

� Open the Status Monitor window as shown in Figure 12-86. Notice the warning regarding No virtual ethernet switches found.

Figure 12-86 Warning for unconfigured network

� Click Reconfigure... (alternatively, open the Options window as seen in Figure 12-75 on page 244 and select Network Connections...) and you will get a screen similar to the one shown in Figure 12-87.

Attention: We will, however, cover in more detail the network configuration for the HS20 blade server, as it requires specific considerations:

� If you configure networking for HS20 blade servers go to 12.9.2, “Configuring the virtual network on HS20 blade servers” on page 252.

� For all other configurations continue with 12.9.1, “Configuring the network for all systems except HS20 blade servers” on page 250.

Important: This chapter assumes that you have decided what network configuration you need (number of physical network adapters, load balancing type and failover type, VLAN configuration, and so on).

Please review 10.1.8, “Network considerations” on page 175 for details and recommendations on your network configuration.


Figure 12-87 Configuring the Virtual Ethernet Switch

� Select the Physical Adapters tab and review the list of adapters displayed. You should see all adapters you assigned for use with the virtual machines.

� Click the Virtual Switches tab.

� Create a virtual switch by clicking Create. A screen as shown in Figure 12-88 displays.

Figure 12-88 Creating a virtual network switch


� Specify a Network Label or accept the default one.

� Check the boxes of the physical adapters you want to include in this virtual switch (see “Virtual switches” on page 175 for details on the implications).

In this example illustrated by Figure 12-88, we connected the two available adapters to the virtual switch, forming a fault tolerant bond of adapters.

� Confirm the creation by clicking Create Switch.

� You should now see a screen similar to Figure 12-89.

Figure 12-89 Configured virtual network switch with two adapters

� Review the settings.

� Repeat the above steps for any additional virtual switches you might want to create.

At this point you can also create Port Groups (VLANs) for your network (not covered here).

� Click Close when you are finished with the network configuration.

� Review the Status Monitor window (as shown in Figure 12-86) — no warning should be displayed any more.

Your network is now configured; continue with 12.10, "ESX Server advanced settings and Qlogic parameters" on page 262.

12.9.2 Configuring the virtual network on HS20 blade servers

This section describes how to configure the network for ESX Server 2.1 on an HS20 blade server. Because the HS20 has a maximum of only two network ports when configured with a Qlogic daughter card, follow the instructions given in this section to achieve fault tolerance for all traffic (including console OS, virtual machines, and VMotion traffic).

Note: Network Labels are a very useful way to assign “friendly names” to your network segments. This is especially important if you use VMotion, where the same name needs to exist on all ESX Servers you want to move virtual machines to and from.


We will reassign the adapters, load a module that allows us to share the adapters between the console and the virtual machines, create a bond of the two adapters, and configure them for securely separated console OS and virtual machine traffic.

Reassigning the network adapters

During the initial device configuration, the first network adapter was assigned for exclusive use by the console OS. We will now change this device allocation using the vmkpcidivy command.

Perform the following steps from the ESX Server console OS (either locally on the system or remotely using the management module or a tool like putty):

� Log into the console OS and type at the command line:

vmkpcidivy -i

A screen as shown in Figure 12-90 displays.

Tip: With the original HS20 blade servers (model 8678 and older BIOS), the integrated Broadcom port connected to the bottom ESM initializes first; consequently, vmnic0 is actually connected to the bottom ESM (switch2), while vmnic1 is connected to the top ESM (switch1), which is somewhat confusing. This was redesigned and fixed with later blade models (8832): vmnic0 is now connected to the top ESM (switch1) and vmnic1 to the bottom ESM (switch2).

Q: So what if you want to mix older and newer blades and want to ensure consistency regarding network paths?

A: With the later BIOS versions (1.05 and beyond) you can change the initialization order of the NICs on the older blades.

Attention: To avoid network configuration problems, it is recommended to disconnect the BladeCenter chassis from the network until the steps in this section have been completed. Please use a direct attached management station if you want to perform any additional steps.

Tip: vmkpcidivy is a very useful command which, among other things, lets you reassign devices to the console OS or virtual machines. This is useful, for instance, if you misconfigured startup devices and cannot get a connection to the management interface (for example, if you accidentally removed or changed the console OS network adapter or removed the boot device from the console).


Figure 12-90 vmkpcidivy -i

� Accept the defaults for all settings by pressing Enter (several times) until the first network adapter is shown

The first network adapter will be marked [c] as it is assigned to the console OS.

� Reassign the adapter to be shared between the console OS and the virtual machines by typing s and pressing Enter as shown in Figure 12-91.

Figure 12-91 vmkpcidivy -i cont.


� Verify that the second network adapter is assigned to the virtual machines [v].

� Accept the remaining defaults by pressing Enter.

� Save your changes as shown in Figure 12-92.

Figure 12-92 vmkpcidivy -i cont.

Important: You must set the first adapter to shared.

Logically it would make sense to assign both adapters to the virtual machines and then share them with the console by using the vmxnet_console module (as explained in “Implementing the vmxnet_console module” on page 257)

However, setting one adapter to shared is actually a necessary workaround for an issue with the current VMware ESX Server code caused by the way the console IP address is assigned.

Attention: Do not reboot at this point (even though you are asked to do it).

Tip: You can verify/review the changes with the management interface by selecting the Options tab and clicking Startup Profile.... You will see that the first network adapter is now set to shared (a setting that cannot be set using the MUI) as shown in Figure 12-93.


Figure 12-93 Verifying adapter assignment using the MUI

Creating a bond

We now create a bond (adapter pair) between the two network adapters. For that purpose, we have to edit the file /etc/vmware/hwconfig. Changes made to the file are permanent, that is, they remain effective after a reboot.

Perform the following steps from the ESX Server console OS (either locally on the system or remotely using the management module or tools like putty).

� Log into the console OS and type at the command line:

vi /etc/vmware/hwconfig

� Scroll down to the end of the file using the PgDn key and press Insert.

� Add the following two lines at the end of the file (as shown in Figure 12-94):

nicteam.vmnic0.team = "bond0"
nicteam.vmnic1.team = "bond0"


Figure 12-94 Editing hwconfig to enable bond

To configure an active/passive relationship with one adapter as the home (active) adapter (see the Tip later in this section), add this one additional line to the hwconfig file:

nicteam.bond0.home_link = "vmnic0"

When using blade models 8832, this will set the NIC connected to the top switch as the active adapter.

Save the changes and exit the editor.
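For reference, after these edits the end of /etc/vmware/hwconfig would contain lines similar to the following (a sketch based on the vmnic0/vmnic1 naming used in this example):

nicteam.vmnic0.team = "bond0"
nicteam.vmnic1.team = "bond0"
nicteam.bond0.home_link = "vmnic0"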

Implementing the vmxnet_console module

This section describes how to use the vmxnet_console module to enable network devices assigned to the virtual machines and enable the bond for the console OS and virtual machines.

Perform the following steps from the ESX Server console OS (either locally on the system or remotely using the management module or tools like putty).

� Log into the console OS and enter the following at the command line:

vi /etc/rc.local

� Scroll down to the end of the file using the PgDn key and press Insert

Tip: By default, both network adapters will be active members of the bond; traffic will be load balanced using MAC address load balancing (distributes networking traffic based on the MAC hardware addresses of the source network adapters over both adapters).

However, this can cause issues in certain network configurations, because the common MAC address is broadcast on both adapters, which can cause problems, for instance, in redundant upstream switch configurations. In this case you might want to enable an active/passive relationship, with one adapter being the home (active) adapter.


� Add the following lines at the end of the file:

# vmxnet_console through bond0
/etc/rc.d/init.d/network stop
rmmod vmxnet_console
insmod vmxnet_console devName=bond0
/etc/init.d/network start
mount -a
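In essence, these lines stop the console OS network, unload and reload the vmxnet_console module bound to bond0, restart the network, and then run mount -a to remount any file systems from /etc/fstab (for example, network file systems that require the network to be up).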

When done, save the changes and exit the editor:

� Reboot the server for all the changes to take effect.

� After the reboot, open the management interface and log in as root.

� Click the Options tab and you will see the screen shown in Figure 12-95.

Figure 12-95 Options screen

� Click Network Connections to display the screen shown in Figure 12-96.


Figure 12-96 Configured virtual network switch with two adapters

� Review the settings, and note that the bond of two adapters manually created in the previous steps is now reflected in the MUI.

� Adjust the Network Label of the virtual switch if required by clicking Edit ...

� Once you have reviewed the network settings, click Close.

Our test configuration

For our test we had the BladeCenter connected to an upstream Cisco switch (two separate segments for redundancy). The test client was then connected to one of the segments of the Cisco switches. See Figure 12-97 for a graphical representation.

Note: Network Labels are a very useful way to assign “friendly names” to your network segments. This is especially important if you use VMotion, where the same name needs to exist on all ESX servers you want to move virtual machines to and from.

Tip: If you do not have to configure VLANs, you are now done with the network configuration from a VMware ESX Server standpoint.

However, you also have to configure your network switches, and what you need to do is really dictated by your specific network requirements and infrastructure. To give you a basis to work from, we have documented the configuration we used during our testing.

Note: This configuration is neither optimized, nor can we guarantee that it will work in your environment.


Figure 12-97 Network setup

Here is how we configured the switches:

� Reset each of the ESMs to default settings (this will ensure that only the default management VLAN1 without VLAN tagging is present).

In our configuration, each ESM has two links configured as a trunk going into one of the Cisco segments:

� Configure the ports on each of the ESMs as trunk ports (if multiple connections are used for redundancy as in our case); see Figure 12-98 for an example.


Figure 12-98 Enable trunking on external ports

� Enable switch failover on each of the ESMs (if you require switch failover in both directions) as shown in Figure 12-99.

Figure 12-99 Enabling switch failover

� Configure your upstream switch (in our case Cisco) so that each of the incoming trunk links is configured as a trunk port.


In our setup, we used the configuration as described in the previous section with vmnic0 being the home NIC (active NIC), so that all traffic, by default, went through the top ESM (switch1).

We tested cable failures while incoming and outgoing ping processes were running. The system survived a single cable failure (of the active path) within the trunk as well as a complete trunk failure (both cables pulled) by failing over to the bottom ESM switch (switch2). We then also pulled the active cable of the surviving trunk, so that only one of the four original connections remained; the systems still continued to communicate in both directions.

Note: We observed that the failback times (when cables were reconnected) were longer than the actual failover times.

12.10 ESX Server advanced settings and Qlogic parameters

The advanced settings for the ESX Server and the Qlogic driver given in this section are required when attaching to a FAStT.

12.10.1 ESX Server advanced settings

When connecting to a FAStT controller, you need to adjust some parameters on each ESX Server. Please follow these steps to configure each server:

� Open the VMware Management Interface.

� Click the Options tab and select Advanced Settings.

� Change the settings as listed in Table 12-1, by clicking on the current setting:

Table 12-1 ESX Server settings for FAStT

You should see something like the picture in Figure 12-100.

Attention: At this point you can also create Port Groups (VLANs) for your virtual machines. Whether you need to configure VLANs depends completely on the required network configuration. If you are unsure, it can always be implemented later.

Note, however, that due to the added complexity, we strongly recommend to implement VLANs only if really required.

Disk.UseDeviceReset    0
Disk.UseLunReset       1

Tip: These changes allow the ESX Server to reset the SAN connections on a LUN level rather than the FAStT controller level, therefore minimizing impact on other LUNs.

These settings are specifically important for clustering configurations.
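As a quick check from the console OS, these options should also be visible under /proc/vmware/config (this path is an assumption based on typical ESX 2.x installations; the supported way to change the values remains the Advanced Settings page described above):

cat /proc/vmware/config/Disk/UseDeviceReset
cat /proc/vmware/config/Disk/UseLunReset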


Figure 12-100 Adjusted ESX settings

� Repeat this procedure for each ESX server.

12.10.2 Adjust Qlogic driver parameters

You need to adjust some Qlogic driver settings on each ESX server in order for all VMware configurations to work correctly. Please follow these steps:

� Open a console session either locally on the ESX server or simply use a tool like putty to open a remote SSH session.

� Login as root.

� To edit the appropriate configuration file, type vi /etc/vmware/hwconfig.

Figure 12-101 Editing the hwconfig file using a remote putty.exe session

� Search for the first instance of your Qlogic adapter (QLogic Corp QLA2300 64-bit, as seen in Figure 12-101) and write down the device number x.y.z (in our case, for example, 1.1.0).

Important: You only need to make these changes on the first instance of an HBA when you have multiple HBAs installed. The changes are driver-wide and will affect all instances of HBAs using the same driver.


� Scroll down and locate the line device.lilo_name.x.y.z.owner = "VM", where lilo_name is the name of the VMware ESX Server in the LILO screen and x.y.z is the device number you wrote down for the first instance of the QLogic adapter. In our case, the line is device.esx.1.1.0.owner = "VM".

� Add (pressing the Insert key) the following line to set the new values for the QLogic driver:

device.lilo_name.x.y.z.options = "qlport_down_retry=10 qlloop_down_time=90"

Where lilo_name is the startup name of the VMware ESX Server in the LILO screen and x.y.z is the device number you wrote down for the first instance of the QLogic adapter.

In our case, as shown in Figure 12-102, we added this line:

device.esx.1.1.0.options = "qlport_down_retry=10 qlloop_down_time=90"

Figure 12-102 Adding Qlogic driver options in hwconfig
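For reference, the relevant portion of /etc/vmware/hwconfig in our example then contains these two lines (your lilo_name and device number will differ):

device.esx.1.1.0.owner = "VM"
device.esx.1.1.0.options = "qlport_down_retry=10 qlloop_down_time=90"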

� After adding the line, press Esc → : → w → q to save and exit the editor.

� Reboot the server.

� Repeat this procedure on each ESX Server.

Tip: It actually does not matter where exactly you insert this line, but it is good practice to keep the structure of your hwconfig file clear and consistent.

Tip: Both values are by default set to 30. By decreasing the time the driver tries to reestablish the link if the port goes down, the failover occurs sooner in case of a path failure.

Tip: You can verify if the settings have been applied successfully using the following command:

cat /proc/scsi/qla2300/x

Where x is the number of your vmhba (typically 0 and 1 or 1 and 2).

The Port down retry value should have changed to 10 for all HBAs as shown in Figure 12-103.

With ESX Server 2.1 there is no indicator that the loop downtime has been changed successfully; this situation is expected to change with ESX Server 2.1.1.
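A minimal shortcut for checking the value (a sketch, assuming the driver output contains a line with the word retry, as shown in Figure 12-103):

cat /proc/scsi/qla2300/1 | grep -i retry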


Figure 12-103 Verify new Qlogic settings

12.11 Planning disk resources and creating VMFS partitions

This section helps you plan and review your virtual disk configuration. It is also a guide on how to create VMFS partitions to store virtual machine files (dsk files).

12.11.1 Considerations and guidelines

While it is impossible to provide generic recommendations suitable for any virtual disk configurations, we provide basic guidelines and inform you of restrictions:

� If you configure your system(s) for VMFS sharing (for instance, VMotion), go to “Considerations for shared VMFS configurations: VMotion” on page 265 and review the listed guidelines before proceeding.

� If you configure your system for clustering, go to “Considerations for cluster configurations” on page 266 and review the listed guidelines before proceeding.

Considerations for shared VMFS configurations: VMotion

These considerations apply when attaching to a FAStT and multiple ESX Servers have access to the same VMFS file system (either for high availability reasons or for VMotion). Each server needs to have access to the shared LUN as described in 12.5, "LUN configuration and storage partitioning" on page 207.

The following are recommended guidelines with ESX Server 2.1 and VirtualCenter:

� Sharing VMFS partitions requires the partition to be formatted as VMFS-2 and the mode set to public (both settings are the default with ESX Server 2.1).

Important: The structure of your VMFS partitions depends on many factors, such as:

� Type of implementation, for example, clustered or non-clustered configuration
� Number of virtual machines
� Expected performance (workloads requiring dedicated LUNs or different RAID levels)

Please review Section 9.4, “VMware ESX Server storage structure: disk virtualization” on page 146 for details and recommendations.

Note: This section does not cover clustering, where the actual .dsk file is shared, not just the VMFS partition.


� Do not run more than 32 virtual machines (doing simultaneous disk-intensive activity) on a VMFS-2 volume shared between two or more hosts running ESX Server 2.1 (otherwise you might experience unacceptable performance).

� Do not exceed 100 virtual machines per VMFS file system, even if they are not disk I/O intensive. It is good practice to distribute your virtual machines over a reasonable number of VMFS file systems. If too many virtual machines share the same VMFS, the LUN might become too busy with locking operations to respond to certain functions of the virtual machines and the system will become unstable.

� Do not exceed 16 ESX hosts sharing the same VMFS partition.

� Although it is technically possible to have multiple VMFS volumes per LUN, this is only valid when only one host can access the LUN. Therefore, we strongly recommend a 1:1 mapping as a best practice.

� For VMotion, a virtual machine can be configured with dsk files residing in different VMFS partitions; for example, you can have a virtual machine with disk1.dsk on VMFS1 and disk2.dsk on VMFS2 (for performance reasons) and still use VMotion, as long as both servers can see both VMFS volumes.

Considerations for cluster configurations

These are the cluster configurations we have tested and will cover in this section:

� Virtual machine cluster (cluster in a box)

� Split virtual machine cluster (cluster across two physical ESX Servers) using raw disks (not VMFS partitions)

� Physical to virtual machine (hybrid) cluster using raw disks

All tests were done using Windows 2003, but Windows 2000 clustering is supported as well. For a clustered Windows guest, you must also set a timeout value of 60 seconds (see 12.13.1, “Clustered Windows guest OS settings” on page 284).

Restriction: VMotion will not work with raw disks.

Attention: While we installed and tested all listed configurations, we will not detail here the step-by-step installation procedure for a cluster configuration; instead, we simply provide a checklist to help you implement these configurations.

Please refer to the ESX Server 2 Administration Guide (on installation CD) for a detailed step-by-step installation procedure for clustering.

Note: Our test comprised the installation and only basic functionality testing. By no means must it be considered a certification test. Please check with IBM support for actually supported configurations.

Important: Please ensure that you have reviewed 11.4.3, “Clustering” on page 192, as it explains the different ways you can implement a cluster configuration with VMware.


Local virtual machine cluster (cluster in a box)

As two virtual machines running on the same ESX Server share dsk files, this configuration is typically used on systems with integrated storage, not SAN attached storage. Of course, it is technically possible to use SAN attached storage for this configuration.

Here are the rules and guidelines for local virtual machine cluster configurations:

� Only 2-node clustering is supported by VMware (even if the OS supports more nodes).

� LUNs used as cluster disks must be formatted as VMFS and use the standard VMFS label notation (used by default when creating a VMFS file system).

� All shared dsk files need to be associated with a separate virtual disk adapter which must be in shared mode virtual.

� Each LUN used for shared access must host only one VMFS filesystem (even though it is only virtual clustering).

� The VMFS volume must be dedicated to the cluster and the access mode must be public (that is, you cannot have dsk files on this VMFS which are not shared).

Split virtual machine cluster

This configuration requires two physical ESX servers. You cluster virtual machines across the two servers to achieve redundancy against server failure.

If you decide to implement VMFS (and thus a single path configuration, considering the restriction described below), the following rules apply for the shared disks:

� All shared dsk files must be created with the vmkfstools utility using the option -z.

� All shared dsk files must use the vmhbaH:T:L:P notation, not the VMFS label notation (see the example after this list).

� Each LUN used for shared access must host only one VMFS filesystem.

� Each VMFS can only host a single shared dsk file.

� The VMFS cannot use VMFS spanning (VMFS volume sets).

� VMFS must be configured in mode shared (not public).

� All shared dsk files need to be associated with a separate virtual disk adapter which must be in shared mode = "physical".
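As an illustration of the vmhbaH:T:L:P notation: vmhba1:0:3:1 refers to partition 1 of LUN 3 on SCSI target 0, seen through adapter vmhba1 (H = HBA number, T = target, L = LUN, P = partition); the actual numbers depend on your configuration.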

Restriction:

With the ESX Server 2.1 build (#7728) used at the time of writing, it was not possible to use VMFS partitions in a split virtual cluster with dual path configurations (that is, ESX servers with multiple HBAs installed). The problem manifests itself by triggering cluster node failovers and other unpredictable behavior in case of fibre path failures.

You can, however, use single path configurations, that is, each ESX server has only one HBA installed.

Since it is not considered good practice to implement a single path configuration in a cluster, we have tested the split VM cluster configuration using raw disks only.

If you use a version higher than 2.1.1, contact IBM support or VMware to verify if this problem has been resolved.


When using raw disks, verify that all shared dsk files are associated with a separate virtual disk adapter, which must be in shared mode = "physical".

Physical to virtual machine (hybrid) cluster using raw disks

With this configuration you cluster a node running on a physical server with a node running in a virtual machine on an ESX Server. This gives you the advantage of a cost attractive failover platform. Indeed, a single ESX server (running multiple virtual machines) can be the failover system for several physical cluster servers.

These are the rules and guidelines for hybrid cluster configurations:

� All shared .dsk files need to be associated with a separate virtual disk adapter which must be in shared mode physical.

� All shared drives must be raw disks.

� The physical (Windows) server can have only one HBA installed (single path only).

� The FAStT can only have one controller blade.

� When configuring storage partitioning the ESX server needs to use host port type LNXCL while the Windows server needs to use the appropriate clustered Windows host port type, for example, Windows 2003 clustered for a server installed with Windows 2003 and MSCS. See Example 12-2 for an illustration.

Tip: Considering all the above restrictions using VMFS in a split cluster configuration, there are actually only very few reasons why you would want to use VMFS partitions versus raw disks. One of the main advantages of using VMFS in non-clustered configurations is the flexibility of disk partitioning and the use of advanced disk modes (such as undoable disks). With raw partitions, you lose this ability, as you have to use the full LUN (that is, you cannot partition it in any way). With FAStT this really isn’t a problem, as you simply can use Storage Manager to create your LUNs with the desired size.

As you can see from the above restrictions, however, in a split cluster configuration there has to be a 1:1 relationship between LUN, VMFS, and .dsk file anyway, so there is no disadvantage of using raw disks. Also, a cluster is typically implemented in a production environment, so disk modes are rarely used.

In conclusion, the only remaining disadvantage is the fact that raw disks are a bit more awkward to manage, as they cannot be given friendly names, and their order can change when adding or removing LUNs.

Note: In this configuration you cannot use VMFS partitions, because the physical server (for instance, one running Windows 2003 with Microsoft Cluster Service (MSCS)) cannot read VMFS partitions, as it does not have a file system driver for this VMware file system.

You therefore must use raw disks.


Example 12-2 Sample storage partitioning configuration for a hybrid cluster

The Storage Partitioning for this configuration needs to be implemented as follows:

Create a host group that contains both the ESX Server and the Windows Server, with the LUNs for the Quorum drive and all other shared drives assigned to the host group (not to the individual hosts), so that both the ESX Server and the Windows Server can access these LUNs, as shown in Figure 12-104. Note that the host ports of the ESX Server are configured as host port type LNXCL, while the Windows 2003 server uses host port type Windows 2003 clustered.

Figure 12-104 LUNs mapped to host group

The logical drives used for the virtual machine system disk (non-clustered, typically VMFS) are assigned to ESX Server (not the host group) as shown in Figure 12-105.

Figure 12-105 LUNs mapped to ESX server

The Windows 2003 server will boot from local disks and therefore sees only the shared drives assigned to the host group.

Important: Because of the restrictions, such as one HBA only and one FAStT controller blade only, we do not recommend this configuration.


12.11.2 Creating VMFS partitions

� Click the Options tab to display the screen shown in Figure 12-106.

Figure 12-106 Options screen

Note: Before following these instructions, make sure you have read 12.11.1, "Considerations and guidelines" on page 265 to determine whether you want to use VMFS partitions or raw LUNs.


� Click Storage Management, and the Storage Management screen is displayed as shown in Figure 12-107.

Figure 12-107 Available LUNs

� Verify that you can see all the LUNs that were assigned to this ESX Server; if not, check your zoning and FAStT Storage Manager partitions.

Note: The information displayed on the screen presented in Figure 12-107 will depend on your disk configuration. For instance, if you install ESX server on a blade server with IDE drives, you will not see the system partitions, as IDE devices are not listed.


� Click Create Volume.

Figure 12-108 VMFS Volume creation

You are now presented with the choice for Typical or Custom as shown in Figure 12-108.

Typical    Creates a single VMFS volume using all the available space.
Custom     Allows you to create multiple VMFS Volumes of varying sizes.

� Select Typical or Custom (our example shows the creation using Typical).

Figure 12-109 VMFS Logical drive name

� Enter a unique name for the VMFS volume, for example, VMFS_Logical_Drive_1 (Figure 12-109), and then click OK.

� Repeat the above steps for all VMFS partitions you need to create.

Tip: ESX Server can dynamically detect added LUNs; you do not have to reboot the server. If you had not connected the server to the SAN, simply connect it now and click Rescan SAN and Refresh to display the LUNs.

Note: Using the vmkfstools -s vmhba[x] command (where [x] is the number of the vmhba), you can also issue a rescan from the console OS command line. Due to a slight bug in ESX 2.1, however, you need to issue this sequence of commands (example for scanning vmhba1):

1. wwpn.pl -s

2. vmkfstools -s vmhba1

3. cos-rescan.sh vmhba1

Tip: For shared VMFS configurations, use Typical, as you can only have one VMFS partition per LUN.


Figure 12-110 Core Dump and VMFS created

� Review the created VMFS Volume(s), and click Close to close the window (Figure 12-110).

You have now successfully set up your VMFS partitions and can proceed with 12.12, “Creating virtual machines” on page 273.

12.12 Creating virtual machines

This section explains how to create virtual machines. How you configure your virtual machines is dictated by your requirements (guest operating system, virtual hardware requirements, function, and so on).

We selected the creation of a virtual machine running Windows 2003 as an example.


Perform the following steps to create a virtual machine:

� Open the management interface and log in (we log in with user vmuser).

� In the Status Monitor window (as shown in Figure 12-111), click Add Virtual machine.

Figure 12-111 Adding a vm

� A screen as shown in Figure 12-112 displays.

Figure 12-112 Select OS

� Select the correct OS and adjust the display name and location of configuration files of the virtual machine

Restriction: Creating virtual machines while logged on as root is possible, but not recommended. Especially for production environments, never use root for performing administrative tasks.


� Click Next.

Figure 12-113 Specify resources

� The screen shown in Figure 12-113 allows you to specify the resources assigned to the virtual machine:

– Specify the number of virtual processors to be used (1 or 2). To be able to select two CPUs you need at least two physical CPUs installed in the server and have the Virtual SMP option enabled (our system was a 1 CPU system with Hyper-threading turned on).

– Specify the amount of memory available to the VM.

– Check the Citrix Terminal Services box if you plan to run Citrix in the virtual machine (this will optimize performance for Citrix based workloads).

� Click Next to go to the next screen (as shown in Figure 12-114).

12.12.1 Creating the virtual disk resources

Before proceeding with this section, make sure that you have read 12.11, "Planning disk resources and creating VMFS partitions" on page 265 and 10.1.4, "VMFS or raw disks?" on page 172 and have determined what type of disk best fits your requirements and specific ESX Server implementation.

Tip: By default, there is just one folder for all virtual machines of one type (for example, one folder for all virtual machines running Windows 2003 Standard Edition). You might find it useful to have distinct folders per virtual machine to keep their respective files separated.


Figure 12-114 Select type of virtual disk

� If you configure a VMFS based disk file, go to “Configuring a disk on a VMFS partition” on page 276

� If you want to configure the virtual machine to use a raw disk, go to “Configuring a virtual machine to use a raw disk” on page 278.

Configuring a disk on a VMFS partition

This is the process used for most system dsk files (for example, the C: partition on Windows) and standard data drives, including those used with VMotion.

On the screen shown in Figure 12-114, click Blank to be presented with the screen shown in Figure 12-115.

Figure 12-115 Create new VMFS based disk


� Configure the following values:

– Select the correct VMFS volume for your virtual disk.
– Give it a meaningful name.
– Specify the size of the disk.
– Specify the Virtual SCSI Node.

– Select a disk mode; select Persistent for all disks unless you have a specific reason to select another disk mode.

� Click Next to get the screen shown in Figure 12-116 (the actual disk configuration will depend on your previous selection).

Figure 12-116 Finish device configuration

Note: If you want to use an already existing .dsk file (for example, one created by cloning the .dsk file of an existing virtual machine), choose Existing instead. In our example we describe the creation of a new .dsk file.

Note: The Virtual SCSI Node ID determines the SCSI adapter number and the SCSI ID the virtual disk will be presented on, for example:

� 0:0 is the virtual disk with SCSI ID 0 (so typically the first disk) on the first virtual SCSI adapter. Simply accept the default (0:0) for the system disk of the virtual machine.

� 1:2 — would be the second disk on a (separate) second virtual SCSI adapter (so the OS would see two separate SCSI adapters even though there might be just one physically installed). This is required when configuring clustering with MSCS, as MSCS requires the shared drives to be on a separate SCSI controller. So the first shared drive (for instance for the Quorum) would typically be ID 1:0.


� If you require additional virtual disks, click Add Device... and repeat the previous steps.

� Review your configuration and alter it if required.

� Click Close

� If you don’t use clustering, go to 12.13, “Guest OS specific settings and remarks” on page 284; otherwise, continue with 12.12.2, “Modifying the disk resources as shared drives for clustering” on page 279.

Configuring a virtual machine to use a raw disk

Use this section only if you have decided to use raw disks:

� On the screen shown in Figure 12-114 on page 276, select System LUN/Disk; the screen shown in Figure 12-117 displays.

Figure 12-117 Assigning a raw disk

� Select the correct LUN, and ensure that it lists (Partitions: 0).

� Note that you cannot configure the size as you assign the whole drive.

� Specify the Virtual SCSI Node.

Note: You can always come back to this page and add devices later.

Tip: While it is possible to also assign a LUN containing a VMFS partition as a raw disk, we recommend always using a LUN for either VMFS or raw access, not both. This is because some configurations require exclusive access to the LUN (for instance, for clustered drives). This should not be a problem, since you can always size your LUNs appropriately, using the FAStT Storage Manager.

Note: The Virtual SCSI Node ID determines the SCSI adapter number and the SCSI ID the virtual disk will be presented on, for example:

� 0:0 is the virtual disk with SCSI ID 0 (so typically the first disk) on the first virtual SCSI adapter. Simply accept the default (0:0) for the system disk of the virtual machine.

� 1:2 — would be the second disk on a (separate) second virtual SCSI adapter (so the OS would see two separate SCSI adapters even though there might be just one physically installed). This is required when configuring clustering with MSCS, as MSCS requires the shared drives to be on a separate SCSI controller. So the first shared drive (for instance, for the Quorum) would typically be ID 1:0.


� Click Next to get the screen shown in Figure 12-118 (the actual disk configuration will depend on your previous selection).

Figure 12-118 Finish device configuration

� If you require additional virtual disks, click Add Device... and repeat the previous steps.

� Review your configuration and alter it if required.

� Click Close.

� If you don’t use clustering, you have now configured your virtual machines and can go to 12.13, “Guest OS specific settings and remarks” on page 284; otherwise continue with 12.12.2, “Modifying the disk resources as shared drives for clustering” on page 279.

12.12.2 Modifying the disk resources as shared drives for clustering

Local virtual machine cluster using VMFS based disks

We explain here how to configure the disk resources for a node of a local virtual cluster using VMFS based disks as shared cluster drives.

Figure 12-119 shows a simplified sample disk configuration for such a setup. The system disk is configured as the first disk on the first virtual SCSI adapter on a VMFS partition: Virtual Disk (SCSI 0:0).

Note: You can always come back to this page and add devices later.

Note: As mentioned in “Considerations for cluster configurations” on page 266, we will only cover the local virtual cluster using VMFS based shared disks and split virtual cluster using raw disks.


The second virtual disk (SCSI 1:0) is also VMFS based. This is our sample shared drive (in reality you would, of course, have multiple shared drives). Note that specifying this drive as the first drive on the second SCSI adapter (1:0) also added the device SCSI Controller 1.

Figure 12-119 Sample config for clustering using VMFS based disks before modifications

To change the bus sharing setting of the SCSI adapter for the shared drives:

- Click Edit... on the appropriate adapter, in our case SCSI Controller 1.

- Select virtual as shown in Figure 12-120.

Note: All virtual SCSI controllers are by default configured to not use any bus sharing (none). For clustering to work, this setting must be changed for the SCSI adapter of the shared drives only.


Figure 12-120 Setting SCSI Controller 1 to mode shared virtual

- Click OK and review your configuration. It must now indicate virtual for the shared SCSI controller, as shown in Figure 12-121.

Figure 12-121 SCSI Controller 1 configured for shared virtual

Note: Setting the adapter to mode shared virtual allows two virtual machines on the same physical ESX Server to share the disk files attached to this adapter.
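In the virtual machine configuration file, this corresponds to the sharedBus option of the shared controller. A minimal sketch, assuming the shared drives sit on SCSI controller 1 and using an example disk file name (the MUI normally maintains these entries for you):

   # allow bus sharing between virtual machines on the same ESX Server host
   scsi1.sharedBus = "virtual"
   scsi1:0.present = "TRUE"
   # shared VMFS based disk file, for example the quorum disk
   scsi1:0.name = "vmfs1:quorum.dsk"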


Split virtual cluster and hybrid clusters
This section explains how to configure split virtual clusters and hybrid clusters using raw disks as shared cluster drives.

Figure 12-122 shows a simplified sample disk configuration for such a setup. The system disk is configured as the first disk on the first virtual SCSI adapter on a VMFS partition - Virtual Disk (SCSI 0:0).

The second virtual disk (SCSI 1:0) is a raw disk and is our sample shared drive (in reality you would of course have multiple shared drives). Note that specifying this drive as the first drive on the second SCSI adapter (1:0) also added the device SCSI Controller 1.

Figure 12-122 Sample config for clustering using raw disks before modifications

To change the bus sharing setting of the SCSI adapter for the shared drives:

- Click Edit... on the appropriate adapter, in our case SCSI Controller 1.

- Select physical as shown in Figure 12-123.

Note: All virtual SCSI controllers are, by default, configured to not use any bus sharing (none). For clustering to work, this setting must be changed for the SCSI adapter of the shared drives only.


Figure 12-123 Setting SCSI Controller 1 to mode shared physical

- Click OK and review your configuration. It must now indicate physical for the shared SCSI controller, as shown in Figure 12-124.

Figure 12-124 SCSI Controller 1 configured for shared physical

Note: Setting the adapter to mode shared physical allows two different physical servers to share the raw disk(s) attached to this adapter.
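The corresponding configuration file entry uses the value physical instead of virtual. A minimal sketch, again assuming the shared raw LUN is the first disk on SCSI controller 1 (the vmhba path is an example only):

   # allow bus sharing across physical hosts (split virtual and hybrid clusters)
   scsi1.sharedBus = "physical"
   # shared raw LUN, referenced by its vmhba path
   scsi1:0.name = "vmhba1:0:2:0"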


12.13 Guest OS specific settings and remarks
In this section we discuss settings required by the guest operating systems of a VMware ESX installation, in relation to FAStT attachment and cluster configurations. We assume that you have already done the following tasks:

- Installed the guest operating system according to the instructions in the OS documentation.

- Installed the VMware Tools (refer to the ESX Installation Guide on the ESX installation CD for details).

12.13.1 Clustered Windows guest OS settings
This section describes the settings recommended for clustered Windows guests.

For all clustered Windows guest OSs, modify or add the registry value TimeOutValue:

1. Open the Registry Editor: click Start → Run..., type regedt32, and click OK.

2. Navigate to HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk\.

3. Check whether a value named TimeOutValue already exists; if it does, go straight to step 6.

4. If the value does not exist, click Edit → New → DWORD Value as shown in Figure 12-125.

Figure 12-125 Using regedt32 to create TimeOutValue

Important: First make sure that the Microsoft Cluster service has been installed. For details and background on this change, see Microsoft Knowledge Base article 818877.


5. Name the new value TimeOutValue.

6. Double-click the TimeOutValue value and you will see a window as shown in Figure 12-126.

Figure 12-126 Editing the value

7. Modify the Value data to 60 (decimal), as shown in Figure 12-126, and click OK.

8. Close the Registry Editor.
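If you prefer to script this change, you can import a registry file instead of editing the value manually. The following .reg file is a minimal sketch that has the same effect as steps 1 through 8 above (0x3c hexadecimal is 60 decimal):

   Windows Registry Editor Version 5.00

   ; Disk I/O timeout in seconds for clustered Windows guests (0x3c = 60)
   [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk]
   "TimeOutValue"=dword:0000003c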

12.13.2 Installing MSCS 2003 on a hybrid cluster
When installing the clustering services with Windows 2003, the setup wizard checks by default whether the configured nodes use similar hardware. When installing MSCS on a hybrid cluster (a physical to virtual machine cluster), the wizard will not install with the default settings when trying to add the second node, as shown in Figure 12-127.

Figure 12-127 Adding the second node

Instead it will display the error message visible in Figure 12-128.


Figure 12-128 Error on adding node

To get the proper configuration, click Advanced on the screen in Figure 12-127 on page 285 and select the Advanced (minimum) configuration option, as shown in Figure 12-129.

Figure 12-129 Avoiding installation errors


Chapter 13. Redundancy by configuration

This chapter describes the behavioral characteristics of a single server, observed by simulating component failures during our testing. Configuration changes for problem rectification are also listed.

For all Windows-based virtual machines, disk I/O was generated using the Iometer utility. For all Linux-based virtual machines, copy batches were used to simulate disk I/O (a simple example is shown below).
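For reference, a copy batch can be as simple as the following shell loop. This is only an illustrative sketch; the test file and the mount point of the FAStT-backed file system are example names:

   #!/bin/sh
   # generate continuous disk I/O on a SAN-attached file system
   while true
   do
       # copy a test file onto the FAStT-backed volume
       cp /tmp/testdata /mnt/fastt_lun/testdata
       # flush buffers so the I/O actually reaches the disk
       sync
   done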


Note: The actual behavior is strongly influenced by the settings used during the installation; make sure that you have followed the instructions given in Chapter 12, “Installing VMware ESX Server” on page 197.


13.1 Single path configuration

The following list shows, for each failing component in a single path configuration, whether protection is expected, whether the failure was tested, and the expected or observed behavior.

- ESX Server (protection expected: No; tested: No):
  Complete failure occurs.

- Path: HBA to switch (protection expected: No; tested: No):
  SAN based virtual machines fail.

- Switch (protection expected: No; tested: No):
  SAN based virtual machines fail.

- Path: switch to FAStT (protection expected: Yes; tested: Yes):
  ESX path failover; LUN(s) move to the alternative controller; Storage Manager indicates LUN Not On Preferred Path (LNOPP) for the moved LUNs; VMware shows the respective path as unavailable as shown in Figure 11-27. I/O pauses during failover (in our case approximately 1 minute), transparent to the guest OS (it becomes unresponsive for this short period, then resumes; no errors observed).
  After reinstating the path, VMware shows all paths as available again but does not move the LUN(s) back to the preferred path, as MRU is set as the failover policy. Therefore Storage Manager keeps indicating LNOPP. Select Storage Subsystem -> Redistribute Logical Drives to return the LUNs to their preferred owner.

- FAStT controller (protection expected: Yes; tested: Yes):
  ESX path failover; LUN(s) move to the surviving controller; the remaining observed behavior is identical to the component failure Path: switch to FAStT, as listed above.

- Changing controller ownership of LUNs manually (Storage Manager) (protection expected: Yes; tested: Yes):
  Transparent to the virtual machine, no visible I/O delay. VMware shows the path to the new controller owner as active; no error messages.

- Changing the preferred path using the ESX MUI (Failover Paths) (protection expected: Yes; tested: Yes):
  The path does not change; the LUN remains on the preferred controller (as specified with Storage Manager).

Note: Changing ownership of LUNs and setting preferred paths should only be done using Storage Manager.


13.2 Dual path redundant configuration

The following list shows, for each failing component in a dual path redundant configuration, whether protection is expected, whether the failure was tested, and the expected or observed behavior.

- ESX Server (protection expected: No; tested: No):
  Complete failure.

- Path 1: HBA to switch (protection expected: Yes; tested: Yes):
  ESX HBA failover; LUN(s) remain on the same controller; VMware shows two paths per HBA as unavailable as shown in ABC. I/O pauses during failover (for just a few seconds), transparent to the guest OS (it becomes unresponsive for this short period, then resumes; no errors observed).
  After reinstating the path, VMware shows all paths as available again but does not move the LUN(s) back to the preferred path, as MRU is set as the failover policy. Failover paths in the ESX MUI cannot be reinstated while I/Os are occurring; pause I/O before reinstating the original path.

- Switch (protection expected: Yes; tested: No):
  Extension of Path 1: HBA to switch (same behavior expected).

- Path 2 / Path 3: switch to FAStT controller A or B (protection expected: Yes; tested: Yes):
  ESX HBA failover; LUN(s) remain on the same controller; VMware shows a single path per HBA as unavailable as shown in ABC. I/O pauses during failover (for just a few seconds), transparent to the guest OS (it becomes unresponsive for this short period, then resumes; no errors observed).
  After reinstating the path, VMware shows all paths as available again but does not move the LUN(s) back to the preferred path, as MRU is set as the failover policy. Failover paths in the ESX MUI cannot be reinstated while I/Os are occurring; pause I/O before reinstating the original path.

- FAStT controller (protection expected: Yes; tested: Yes):
  ESX path failover; LUN(s) move to the surviving controller; ESX HBA failover; Storage Manager indicates LUN Not On Preferred Path (LNOPP) for the moved LUNs; VMware shows two paths per HBA as unavailable as shown in ABC. I/O pauses during failover (for just a few seconds), transparent to the guest OS (it becomes unresponsive for this short period, then resumes; no errors observed).
  After reinstating the path, VMware shows all paths as available again but does not move the LUN(s) back to the preferred controllers, as MRU is set as the failover policy. Storage Manager also keeps indicating LNOPP. Select Storage Subsystem -> Redistribute Logical Drives to return the LUNs to their preferred controller owner. Failover paths in the ESX MUI cannot be reinstated while I/Os are occurring; pause I/O before reinstating the original path.

- Changing controller ownership of LUNs manually (Storage Manager) (protection expected: Yes; tested: Yes):
  Transparent to the virtual machine, no visible I/O delay. VMware shows the path to the new controller owner (same HBA) as active; no error messages in Storage Manager.

- Changing the preferred path using the ESX MUI (Failover Paths) (protection expected: Yes; tested: Yes):
  The HBA can be changed, but only to the other HBA that has a path to the same controller, so the LUN remains on the preferred controller (as specified with Storage Manager).

Note: Changing ownership of LUNs and setting preferred paths should only be done using Storage Manager.


Related publications

The publications listed in this section are considered particularly suitable for a more detailed discussion of the topics covered in this redbook.

IBM Redbooks
For information on ordering these publications, see “How to get IBM Redbooks” on page 292. Note that some of the documents referenced here may be available in softcopy only.

- IBM SAN Survival Guide, SG24-6143-01

- IBM SAN Survival Guide Featuring the IBM 2109, SG24-6127

- IBM TotalStorage FAStT900/600 and Storage Manager 8.4, SG24-7010

- IBM TotalStorage Solutions for xSeries, SG24-6874

- Introduction to Storage Area Networks, SG24-5470-01

- Server Consolidation with the IBM eServer xSeries 440 and VMware ESX Server, SG24-6852

- IBM eServer xSeries 440 Planning and Installation Guide, SG24-6196

- IBM eServer BladeCenter Networking Options, REDP-3660

- The Cutting Edge: IBM eServer BladeCenter, REDP-3581

Other publications
These publications are also relevant as further information sources:

- IBM TotalStorage FAStT Hardware Maintenance Manual, GC26-7640-00

- IBM TotalStorage FAStT Problem Determination Guide, GC26-7642-00

- IBM TotalStorage FAStT Storage Manager Version 8.4x Installation and Support Guide for Intel-based Operating System Environments, GC26-7621-03

- IBM TotalStorage FAStT Storage Manager Version 8.4x Installation and Support Guide for AIX, HP-UX, and Solaris, GC26-7622-02

- Fibre Channel Hard Drive and Storage Expansion Enclosure Installation and Migration Guide, GC26-7639

- IBM Netfinity Rack Planning and Installation Guide, Part Number 24L8055

Online resources
These Web sites and URLs are also relevant as further information sources:

- IBM TotalStorage FAStT Web site:
  http://www.storage.ibm.com/disk/fastt/index.html

- IBM TotalStorage 7133 Web site:
  http://www.storage.ibm.com/disk/7133/index.html


- IBM Personal Computing support site:
  http://www.pc.ibm.com/support

- VMware site:
  http://www.vmware.com

How to get IBM Redbooks
You can search for, view, or download Redbooks, Redpapers, Hints and Tips, draft publications, and Additional materials, as well as order hardcopy Redbooks or CD-ROMs, at this Web site:

ibm.com/redbooks

Help from IBM
IBM Support and downloads

ibm.com/support

IBM Global Services

ibm.com/services


Index

A
access logical drive 61, 75
access LUN 210, 212
adapter binding 171
adapter cable 22
ADT 11, 51, 53, 74, 167
alert 8, 83
Alert Delay Period 53
alert notification 11, 53
append 149
array 7, 27, 33, 36, 70–71, 153, 155
   configuration 34
   creating 72
   defragment 80
   migrating 93
   number of drives 34
   size 33, 78
attenuation 20
Auto Logical Drive Transfer. See ADT
Automatic Discovery 60

B
battery 6, 11, 29, 31, 42–43
binding 171–172
BladeCenter 155–157, 177, 196, 203, 215
   Brocade SAN switch module 159
   disk storage 158
   features 156
   Fibre Channel switch module 158
   Gb Ethernet 160
   Management Module (MM) 215
   Optical Passthrough Module 160
   USB mouse 215, 230
block size 38
bond 256, 259
BOOTP 11, 58
Buslogic 149

C
cable
   labeling 23–24
   management 23
   routing 24
   types 19
cabling 19
   FAStT cabling configuration 62
cache 31, 40
   block size 41, 44
   flushing 41, 43
   hit percentage 79
   memory 79
   mirroring 41, 43
   read ahead 79
   read-ahead multiplier 79
   settings 41
cache_method 110
cache_write 111
channel protection 34
chdev 110–111
clock 62, 70, 85
cluster 74, 128, 130, 266
   configuration 266
   hybrid 268
   Microsoft Cluster Server 6
   Novell cluster 6
   Veritas Cluster 6
clustering 190, 192
Concurrent Resource Manager (CRM) 129
connectors 19
console OS 146
consolidation 152
controller ownership 36, 38, 52, 78
Copy 4
copy 117
   FlashCopy 4, 9, 14, 46–47, 81–82, 174
   Remote Volume Mirroring 14, 17, 44
   RVM 174
   services 4
   VolumeCopy 9, 82, 174
copy priority 82
copyback 81
Core Dump 146, 242

D
dac 107–108
DACstore 93, 95
dar 107–108
data striping 28–29
default group 40, 72–73
defragment 80
device driver 112
direct attached 173
discovery 62, 181
disk
   mirroring 28, 31
   raw 147, 172
   thrashing 27
Disk.UseDeviceReset 262
DiskUseLunReset 262
DMP 50
dsk 147
Dynamic Capacity Addition 4
Dynamic Capacity Expansion 154
Dynamic Capacity Expansion (DCE) 81
Dynamic Logical Drive Expansion 4
Dynamic Logical Drive Expansion (DVE) 81
Dynamic RAID Level Migration (DRM) 81
Dynamic Reconstruction Rate (DRR) 81
Dynamic Segment Sizing (DSS) 81, 154

E
ESM firmware 90
ESX Server 142
   architecture 144
   dsk modes 148
   features 145
event log 38
Event Monitor 8, 83
EXP500 5, 67, 92–93
EXP700 5, 67
extendvg 113–114

F
fabric 14
failover 10, 12, 25, 38, 43, 51, 54, 166, 168, 262
   policy 167
failover alert delay 53–54
Fast!UTIL 199, 213
FAStT
   evolution 4
   MSJ 40
   utilities 8
FAStT100 5
FAStT200 5
FAStT600 5
FAStT600 Turbo 5
FAStT700 5
FAStT900 6
FC Switch Module (FCSM) 198
FCSM 198
feature key 44
fget_config 108
Fibre Channel 13
firewall 58
firmware 61–62, 88, 90
FlashCopy 9, 14, 46–47, 81–82, 174
   logical drive 10
floor plan 18
flushing 41, 43
frame switch 14
free space node 80

G
GBIC 20
Gigabit Interface Converters 20
gigabit transport 20
GPFS 134
Graphical Installer 215, 230
guest OS 284

H
HACMP 6, 127–128
HANFS 129
HBA
   configuration for VMware ESX Server 199
hdisk 107–108, 111
heterogeneous host 39, 73
High Availability Subsystem (HAS) 129
host agent 60, 62, 83
host group 40, 73–74, 210, 213
host port 40, 73, 210
host ports 213
host type 72, 206
hot_add 8, 75
hot-add 8
hot-scaling 93, 96
hot-spare 71, 81
   global 71
hot-spare drive 44
HS20 155, 158, 177, 252
HS40 155, 158
hub 14
hybrid cluster 190, 282, 285
Hyper-threading 227

I
IBM SAN File System 39
IBM ServerProven 155
IDE 228
in-band 59, 70
initial discovery 60
inter-disk allocation 49
inter-switch link 14, 183
IOPS 38
IP address 59
ISL 14, 46

J
JFS 47–48

L
labeling 23
large I/O size 10
LC connector 21
LC-SC 22
LDM 49
LILO 229
Linux 51, 75, 89
LIP 62, 200
LNXCL 206, 210, 212, 268
load balancing 27, 49, 52
load sharing 26
logical drive 35, 155
   base 10
   creating 72
   FlashCopy 10
   mapping 40
   primary 9, 37, 45, 81
   secondary 9, 37, 45, 81
logical drive transfer alert 53
logical partition 113
logical volume 113
Logical Volume Manager (LVM) 135
loop 14
lsattr 108
lsdev 108
LSI Logic controller 149–150
LUN 35
   assignment 36
   discovery 184
   mapping 40
   masking 39, 73
   sharing 211
LVM 47–48

M
Management Interface 224, 262
mapping 74
memory image 147
microcode 61, 72
migratepv 114
migration 93, 120, 124
Migration Director 121–122
mini hub 67
mini-hub 182
mirroring 28, 31
mirrorvg 114
mkvg 113
modal dispersion 19
modification operation 80
modification priority 80
monitoring 36
mru 167
MSCS 192, 285
MSJ 40
multi-mode fiber (MMF) 19
multipath driver 51
My support 88

N
NAS storage 173
netCfgSet 59
netCfgShow 59
network adapter
   pair 176
   virtual 175
Network Label 176, 252, 259
network parameters 11
node failover 128
nodeset 134
node-to-node 14
nonpersistent 148–149
NVSRAM 11, 61–62, 206

O
ODM 110
out-of-band 59, 70

P
Parallel System Support Programs (PSSP) 135
partition 237
password 70
PCI slots 25
performance 27–28, 33, 37, 42–43
performance monitor 38, 42, 78
persistent 148–149
Persistent reservations 9
physical volume 112
Piper 120
planning 13
point-to-point 14
polling interval 78
power cycle 62
preferred controller 36, 38, 51, 54, 166, 185
prefetch_mult 110
premium feature 4
premium features 44
profile 87

Q
queue depth 111
Quorum 277

R
RAID 4, 113, 153
   controller 6
   level 7, 28, 32, 72, 79
   reliability 32
Raw disk 173
raw disk 147, 172, 174, 266, 278
raw disks 268
RDAC 8, 11, 26, 61, 74, 106, 166
read caching 42
Read Link Status Diagnostic, see RLS
read percentage 79
read-ahead multiplier 41–42, 72
Recovery Guru 54
Recovery Profile 11
Redbooks Web site 292
   Contact us xix
redo.log 149
reducevg 113
redundancy 65
Redundant Disk Array Controller, see RDAC
Reliable Scalable Cluster Technology (RSCT) 128
Remote Deployment Manager (RDM) 215
Remote Technical Support 143
Remote Volume Mirroring 14, 17, 44, 80, 182
RLS 11
rmlvcopy 114
round-robin 52
rpd 136
RVM 174
RVM, see Remote Volume Mirroring

S
SAN 12–13
   boot from 173
SATA 5
SC connector 21
script 62
SCSI 14, 113
SCSI locking 148
SCSI reservations 155
segment size 38, 72, 155
Serial ATA 5
serial connection 58
serial port 58
Service Alert 4, 83, 85
Service Console 164, 227, 234
SFF 20
SFP 20, 67
shared VMFS 191
shutdown 93
single mode fiber (SMF) 19
Small Form Factor Transceivers 20
Small Form Pluggable 20
SMclient 60, 105
SMdevices 8
SMruntime 105
SMTP 83
SNMP 83
spare 31
split virtual cluster 282
split virtual machine 267
SSA 113
Startup Profile 224, 227, 235
Storage Area Network, see SAN
storage bus 12
Storage Manager 7
   Agent 8
   Client 7
storage partition 7
storage partitioning 36, 39, 73, 208
stripe kill 33
sub-disk 50
subsystem profile 87
sundry drive 94
support
   My support 88
Surveyor 121
Swap File 146
swap file 228, 235, 242, 244, 247
swap space 226
switch 14
   ID 92
synchronization 45, 81–82

T
thrashing 168
throughput 34, 38, 41, 43
Tivoli SANergy 39
Total I/O 78
transceivers 20
transfer rate 34
trashing 27
tray ID 92
trunking 14
tuning 27

U
undoable 148–149
unmirrorvg 114
upgrade 90
user profile 84
userdata.txt 84–85
utilities 8
utility
   hot_add 8
   SMdevices 8

V
Veritas Volume Manager 50
virtual disk 275
virtual ethernet switch 175
virtual machine
   create 273
virtual machine cluster 192–193
virtual network 176, 250
Virtual SCSI Node 278
Virtual SMP option 219
virtual switch 175–176, 252
VirtualCenter 142
virtualization 146
VLAN 178, 250, 259
VMFS 147, 172–174, 180, 228, 242
   access modes 148
   independent VMFS 189
   partition 265, 270
   public VMFS 189
   shared VMFS 189
   spanning 267
VMFS sharing 208
vmfsktools 148
vmkfstools 149, 165, 245, 272
vmkpcidivy 253
vmnet 176
vmnic 176
VMotion 142, 177, 180, 190, 192, 208, 265–266
VMware
   ESX Server 142–143
   GSX Server 142–143
   P2V Assistant 142
   VirtualCenter 142
   Workstation 142–143
VMware ESX Server File System. See VMFS
vmxnet_console module 255, 257
volume 35
volume group 112
VolumeCopy 9, 80, 82, 174
VxVM 50

W
Wizard
   Create Copy 9
World Wide Name (WWN). See WWN
write caching 43
write_cache 110
write-back 43
write-through 41, 43
WWN 16, 40, 74, 199, 209, 212

X
X-Architecture 156

Z
zone 15–16
   types 16
zoning 15, 187, 194




SG24-6363-00 ISBN 0738491330

INTERNATIONAL TECHNICAL SUPPORT ORGANIZATION

BUILDING TECHNICAL INFORMATION BASED ON PRACTICAL EXPERIENCE

IBM Redbooks are developed by the IBM International Technical Support Organization. Experts from IBM, Customers and Partners from around the world create timely technical information based on realistic scenarios. Specific recommendations are provided to help you implement IT solutions more effectively in your environment.

For more information:
ibm.com/redbooks

IBM TotalStorage FAStT Best Practices

FAStT concepts and planning

Implementation, migration, and tuning tips

VMware ESX Server support

This IBM Redbook represents a compilation of best practices for configuring FAStT and gives hints and tips for an expert audience on topics such as GPFS, HACMP, clustering, VMware ESX Server support, and FAStT migration. It is an update and replacement for the IBM Redpaper REDP-3690.

Part 1 provides the conceptual framework for understanding FAStT in a Storage Area Network and includes recommendations, hints, and tips for the physical installation, cabling, and zoning. Although no performance figures are included, we discuss the performance and tuning of various components and features to guide you when working with FAStT.

Part 2 presents and discusses more advanced topics, including a technique for migrating from 7133 SSA disks to FAStT, as well as High Availability Cluster Multiprocessing (HACMP) and General Parallel File System (GPFS) in an AIX environment, as they relate to FAStT.

Part 3 is dedicated to the VMware ESX Server 2.1 environment and provides substantial information for different configurations and attachment to FAStT.

This book is intended for IBM technical professionals, Business Partners, and customers responsible for the planning, deployment, and maintenance of IBM TotalStorage FAStT products.

Back cover