Improving Disk Latency and Throughput with VMware, presented by Raxco Software, Inc., March 11, 2011



Improving Disk Latency and Throughput with VMware

Presented by Raxco Software, Inc.

March 11, 2011

Today’s Agenda

• Provide technical information on how NTFS impacts VMware I/O performance
• Examine ESX I/O test results
• Economic impact of Windows guests
• Solutions

Virtualization Benefits

• Server consolidation
• Less physical space for data centers
• Lower energy costs
• Easier management
• Eco-friendly alternative

Identifying and Correcting Problems

• Latency is your best indicator of a performance problem
  – Device latency is vSphere’s report of the physical storage response time
  – Kernel latency is vSphere’s report of ESX’s ability to manage IO
• Experts disagree on specifics, but most agree that:
  – Device latency in excess of 15ms is worth inspection
  – Device latency in excess of 30ms is likely a problem
  – Kernel latency in excess of 2ms means ESX queues are overflowing
• High device latency can result in ESX queuing
  – So, correct slow hardware first!
  – Then, consider reducing VMDKs on a VMFS volume
  – Only then consider changing queue depths

© Copyright 2010 EMC Corporation. All rights reserved.
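The rules of thumb on this slide can be written as a small triage check. This is a minimal sketch: the thresholds (15ms, 30ms, 2ms) come from the slide, while the function name, argument names, and message strings are illustrative assumptions, not part of vSphere or any VMware tool.

```python
# Thresholds quoted on the slide; everything else here is illustrative.
DEVICE_INSPECT_MS = 15.0
DEVICE_PROBLEM_MS = 30.0
KERNEL_QUEUE_MS = 2.0


def triage_latency(device_ms: float, kernel_ms: float) -> list[str]:
    """Return findings for one device/kernel latency sample (milliseconds)."""
    findings = []
    if device_ms > DEVICE_PROBLEM_MS:
        findings.append("device latency is likely a problem")
    elif device_ms > DEVICE_INSPECT_MS:
        findings.append("device latency is worth inspecting")
    if kernel_ms > KERNEL_QUEUE_MS:
        findings.append("ESX queues are overflowing")
    return findings


print(triage_latency(device_ms=22.0, kernel_ms=0.5))
print(triage_latency(device_ms=41.0, kernel_ms=3.1))
```

Device latency is checked first, mirroring the slide's advice to correct slow hardware before touching queue depths.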

Storage Contention Solution: Storage IO Control

• SIOC calculates data store latency to identify storage contention
  – Latency is a normalized average across virtual machines
  – IO size and IOPS included
• SIOC enforces fairness when data store latency crosses threshold
  – Default of 30ms
  – Fairness enforced by limiting VMs’ access to queue slots
• Net effect: trade throughput for latency

© Copyright 2010 EMC Corporation. All rights reserved.

With Storage IO Control: actual disk resources utilized by each VM are in the correct ratio, even across ESX hosts
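The idea of enforcing fairness through queue slots can be illustrated roughly as follows. This is a toy sketch, not VMware's algorithm: it uses a plain mean rather than the normalized average (weighted by IO size and IOPS) that the slide describes, and the `shares` input and function name are assumptions made for the example. Only the 30ms default threshold comes from the slide.

```python
THRESHOLD_MS = 30.0  # default congestion threshold quoted on the slide


def allocate_queue_slots(latencies_ms, shares, total_slots):
    """Split queue slots by share once average latency crosses the threshold.

    latencies_ms: observed data store latency per VM (name -> ms)
    shares:       relative weight per VM (name -> int), a hypothetical input
    Returns None when there is no contention, meaning no throttling.
    """
    average = sum(latencies_ms.values()) / len(latencies_ms)
    if average <= THRESHOLD_MS:
        return None
    total_shares = sum(shares.values())
    # Integer division keeps the sketch simple; leftover slots stay unassigned.
    return {vm: total_slots * s // total_shares for vm, s in shares.items()}


latencies = {"vm_a": 45.0, "vm_b": 35.0, "vm_c": 22.0}
shares = {"vm_a": 1000, "vm_b": 1000, "vm_c": 2000}
slots = allocate_queue_slots(latencies, shares, total_slots=32)
print(slots)
```

Capping queue slots lowers each VM's peak throughput in exchange for lower latency, which is the trade the slide describes.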

NTFS I/O Storms

NTFS Behavior

• NTFS fragments files and free space
• Increases logical I/O to storage controller
• More logical I/O = more physical I/O
• Multiple instances of Windows on host can lead to I/O contention

What is Fragmentation?

Logical v. Physical

• Logical Level
  – NTFS needs disk and cluster size, enumerates LCNs
  – Creates $MFT and $Bitmap metadata
  – $Bitmap is how NTFS “sees” the disk
  – Has no idea about physical/virtual disk types

Anatomy of an MFT Record

(vcn, lcn, run length): (8a85, 9189a, 7)
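A run such as the one above maps a range of virtual cluster numbers (offsets within the file) to logical cluster numbers (positions on the volume). The sketch below shows that lookup, assuming the slide's values are hexadecimal. The `Run` type and `vcn_to_lcn` function are illustrative names; real MFT records store runs in a compressed on-disk encoding that is not modeled here.

```python
from typing import NamedTuple, Optional


class Run(NamedTuple):
    vcn: int     # first virtual cluster number covered by the run
    lcn: int     # logical cluster number where the run starts on the volume
    length: int  # number of contiguous clusters in the run


def vcn_to_lcn(runs: list[Run], vcn: int) -> Optional[int]:
    """Translate a file-relative cluster to a volume-relative cluster."""
    for run in runs:
        if run.vcn <= vcn < run.vcn + run.length:
            return run.lcn + (vcn - run.vcn)
    return None  # the VCN is not covered by any run


runs = [Run(vcn=0x8A85, lcn=0x9189A, length=7)]
print(hex(vcn_to_lcn(runs, 0x8A87)))  # two clusters into the run
```

A contiguous file needs one run; every additional fragment adds another run to the list.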

File Allocation

• Create $MFT record (one or more)
• $Bitmap accessed to locate free space
• $MFT record is updated with content

Diagram: Create → Bitmap Access → MFT Update
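The allocation sequence can be sketched with a list of booleans standing in for $Bitmap (True means the cluster is in use). The first-fit policy, the `allocate` function, and the dictionary used as an MFT record are simplifying assumptions for illustration; they are not how NTFS actually chooses free space.

```python
def allocate(bitmap: list[bool], clusters_needed: int) -> list[tuple[int, int]]:
    """First-fit allocation: mark clusters used, return (lcn, run_length) extents."""
    if bitmap.count(False) < clusters_needed:
        raise ValueError("not enough free clusters")
    extents = []
    lcn = 0
    while clusters_needed > 0:
        if bitmap[lcn]:
            lcn += 1  # cluster already in use, keep scanning
            continue
        start = lcn
        while lcn < len(bitmap) and not bitmap[lcn] and clusters_needed > 0:
            bitmap[lcn] = True  # claim the cluster in the bitmap
            lcn += 1
            clusters_needed -= 1
        extents.append((start, lcn - start))
    return extents


# Free space is already fragmented: free clusters at 1-2, 5-7 and 9.
bitmap = [True, False, False, True, True, False, False, False, True, False]
extents = allocate(bitmap, clusters_needed=5)
mft_record = {"name": "example.dat", "extents": extents}  # hypothetical record
print(mft_record)
```

Because the free space was fragmented, a 5-cluster file is created in two fragments from the start, which is how free space fragmentation turns into file fragmentation.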

File Access

• Load portion of MFT with correct record via directory
• Locate file in the MFT
• Pass starting LCNs and run lengths to disk controller
• Number of logical fragments influences number of physical seeks

Diagram: Load → Locate File → # LCNs → # Physical Seeks
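To see why the fragment count matters, consider how many (starting LCN, run length) pairs must be passed down for the same amount of data. In this sketch, runs that sit next to each other on the volume are merged into one request. That coalescing step and the `coalesce` name are assumptions for illustration; the slide only states that the number of logical fragments influences the number of physical seeks.

```python
def coalesce(extents):
    """Merge runs that are adjacent on the volume; each result is one request."""
    merged = []
    for lcn, length in extents:
        if merged and merged[-1][0] + merged[-1][1] == lcn:
            # This run starts where the previous one ended: extend it.
            merged[-1] = (merged[-1][0], merged[-1][1] + length)
        else:
            merged.append((lcn, length))
    return merged


# Two layouts of the same 16-cluster file.
fragmented = [(100, 4), (900, 2), (350, 6), (20, 4)]
contiguous = [(100, 4), (104, 2), (106, 6), (112, 4)]

print(len(coalesce(fragmented)), "requests for the fragmented layout")
print(len(coalesce(contiguous)), "request for the contiguous layout")
```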

Logical v. Physical

• Physical Level
  – Disk controller maps LCNs to PCNs
  – Writes data to disk

Wasted Seeks

Measured when running SYSmark:

Partition State     Total I/O Requests Sent   Total Resulting        Net Wasted   Percent Net
                    to the File System        Disk Accesses/Seeks    Seeks        Wasted Seeks
Fragmented          1,320,686                 2,090,649              769,963      58.30%
After PerfectDisk   1,434,454                 1,616,847              182,393      12.72%
After Built-In      1,411,613                 1,931,395              519,782      36.82%
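The last two columns follow from the first two: net wasted seeks are the disk accesses beyond the number of I/O requests, and the percentage is taken relative to the I/O requests. A short check of the arithmetic, with the figures copied from the table and illustrative variable names:

```python
# (I/O requests sent to the file system, resulting disk accesses/seeks)
rows = {
    "Fragmented": (1_320_686, 2_090_649),
    "After PerfectDisk": (1_434_454, 1_616_847),
    "After Built-In": (1_411_613, 1_931_395),
}


def wasted_seeks(io_requests, disk_accesses):
    """Return (net wasted seeks, wasted seeks as a percent of I/O requests)."""
    wasted = disk_accesses - io_requests
    return wasted, 100 * wasted / io_requests


for state, (requests, accesses) in rows.items():
    wasted, pct = wasted_seeks(requests, accesses)
    print(f"{state:18} {wasted:>9,} {pct:6.2f}%")
```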

How This Affects A Virtual Environment

• P2V Conversion
• Extra Hypervisor Overhead
• Disk Latency Degradation
• Overall Performance
• System Throughput
• Wasted Space
• Costly

P2V Conversion

Physical Drive   No Optimization   Optimization
24GB             24GB              22GB (2GB smaller)

ESX Cluster Testing

• Identical disks - 40% free space
• Optimized one set, the other “as is”
• Installed MS Office and MS SQL
• Captured metrics with VMware’s vscsiStats utility

Total I/O Count

                 Fragmented   PerfectDisk   % Improvement
Total IO Count   37191        29238         21.3
Read IO Count    3066         2799          8.7
Write IO Count   34125        26439         22.5

                  30ms    50ms   100ms   >100ms   Total
Fragmented I/O    12749   9877   8700    9116     40,442
PerfectDisk I/O   6707    4923   4081    5053     20,764

49% Reduction in Latency!
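The headline figure appears to correspond to the drop in the number of I/Os that landed in the 30ms-and-slower buckets. A quick check of the totals and the percentage, using the counts from the table (the variable names are illustrative):

```python
buckets = ["30ms", "50ms", "100ms", ">100ms"]
fragmented = [12749, 9877, 8700, 9116]
perfectdisk = [6707, 4923, 4081, 5053]

total_fragmented = sum(fragmented)
total_perfectdisk = sum(perfectdisk)

# Percent fewer I/Os in the slow buckets after optimization.
reduction = 100 * (total_fragmented - total_perfectdisk) / total_fragmented

print(f"Fragmented total:  {total_fragmented:,}")
print(f"PerfectDisk total: {total_perfectdisk:,}")
print(f"Reduction: {reduction:.0f}%")
```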

Disk Latency

                         Fragmented Disk   PerfectDisk Disk
Total IO Equal to 524K   2512              848
Total IO > 524K          247               2959
Read IO Equal to 524K    33                7
Read IO > 524K           125               65
Write IO Equal to 524K   2480              841
Write IO > 524K          122               2894

12X More Large I/O
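The "12X" figure can be reproduced from the "Total IO > 524K" row. The counts below are copied from the table; the variable names are illustrative.

```python
large_io_fragmented = 247    # Total IO > 524K, fragmented disk
large_io_perfectdisk = 2959  # Total IO > 524K, PerfectDisk disk

ratio = large_io_perfectdisk / large_io_fragmented
print(f"{ratio:.1f}x more I/Os larger than 524K")
```

Fewer, larger I/Os move the same data with less per-request overhead, which is consistent with the lower total I/O counts reported earlier.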

12 times more of the largest IO

Large I/O

Improved Sequential I/O

                     Fragmented   PerfectDisk   Improvement
Percent Sequential   17%          27%           58%
Total IO             127703       90526         25%
Sequential IO        22126        24340         33%

Improved Sequential I/O

Installation Time Comparison

                    Fragmented   PerfectDisk   % Improvement
MS Office Install   20 min       15 min        25
MS SQL Install      76 min       51 min        33
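The improvement column is the reduction in install time relative to the fragmented case. A small check using the times from the table (the function name is illustrative):

```python
def percent_improvement(before_min, after_min):
    """Reduction in elapsed time, as a percent of the original time."""
    return 100 * (before_min - after_min) / before_min


installs = {"MS Office Install": (20, 15), "MS SQL Install": (76, 51)}
for name, (fragmented, optimized) in installs.items():
    print(f"{name}: {percent_improvement(fragmented, optimized):.0f}% less time")
```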

The Cost of Fragmentation

EXAMPLE:

• 20 files x 6 seconds = 2 minutes
• 300 users x 2 min = 10 hours/day
• 10 hrs x $25/hr = $250/day
• Annual cost = $62,500 ($250/day x 250 working days)
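The example can be reproduced step by step. The 250 working days per year is an assumption: it is not stated on the slide, but it is the value implied by dividing $62,500 by $250. All other inputs are the slide's own, and the variable names are illustrative.

```python
FILES_PER_USER_PER_DAY = 20
SECONDS_LOST_PER_FILE = 6
USERS = 300
HOURLY_RATE = 25     # dollars per hour
WORKING_DAYS = 250   # assumption implied by $62,500 / $250

minutes_per_user = FILES_PER_USER_PER_DAY * SECONDS_LOST_PER_FILE / 60
hours_per_day = USERS * minutes_per_user / 60
cost_per_day = hours_per_day * HOURLY_RATE
annual_cost = cost_per_day * WORKING_DAYS

print(f"{minutes_per_user:.0f} min/user, {hours_per_day:.0f} hours/day")
print(f"${cost_per_day:,.0f}/day, ${annual_cost:,.0f}/year")
```

Changing any one input (users, hourly rate, seconds lost per file) rescales the annual figure proportionally.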

Virtual Guest Fragmentation

• Windows guests have all the same NTFS behavior
• Fragmentation produces more IOPS
• Fragmentation reduces ESX throughput
• Fragmentation increases ESX disk latency
• Fragmentation creates resource contention between host & guests

Solutions

• Expensive
  – More disks and faster disks
  – Upgrade Fibre Channel
  – Troubleshooting
• Inexpensive
  – Optimize the Windows guest systems

PerfectDisk 12 vSphere

• Virtualization Awareness/host & client (NEW)
• OptiWrite Fragmentation Avoidance (NEW)
• “Zero-fill” free space (NEW)

PerfectDisk 12 vSphere

• “Short stroking” for thin provisioned disks (NEW)
• Schedule guest compaction (NEW)
• Snapshot & Linked Clone recognition (NEW)

PerfectDisk Benefits on ESX

• Saves $$$ in productivity and admin
• Reduces resource contention for VMs
• Reduces total IO workload
• Improves throughput
• Reduces disk latency
• Delivers optimal performance

Contact Raxco

• Free Evaluation Software
• Excellent Support to Get You Started
• White Papers
• Great ROI
• www.raxco.com
• Toll Free: 1.800.546.9728
