VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Conducting a Successful Virtual SAN Proof of Concept Cormac Hogan, VMware, Inc Julienne Pham, VMware, Inc STO4572 #STO4572

Upload: vmworld

Post on 08-Feb-2017

TRANSCRIPT

Page 1: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Conducting a Successful Virtual SAN Proof of Concept

Cormac Hogan, VMware, Inc
Julienne Pham, VMware, Inc

STO4572

#STO4572

Page 2: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Disclaimer

• This presentation may contain product features that are currently under development.

• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.

• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

• Technical feasibility and market demand will affect final delivery.

• Pricing and packaging for any new technologies or features discussed or presented have not been determined.

CONFIDENTIAL 2

Page 3: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Page 4: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Agenda

1 Introduction to STO4572 Session

2 Introduction to Virtual SAN

3 Initial considerations for a proof of concept on VSAN

4 Tools available to conduct a successful proof of concept

5 POC validation scenarios

6 Measuring Performance

7 Moving from POC to Production

Page 5: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

This Session…

• Virtual SAN has been available for 18 months

• VMware recognizes that conducting a Virtual SAN proof of concept can be challenging

• Since the launch of Virtual SAN, additional tools for managing, monitoring and troubleshooting Virtual SAN have become available

• This session discusses the tools available to vSphere and Virtual SAN administrators and how they can help deliver a Virtual SAN proof of concept

• The session will also cover considerations for moving Virtual SAN from POC to production

Page 6: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Unprecedented Customer Momentum

2000+ Customers in the first 15 months

“In my experience VMware solutions are rock solid…we’re ready to nearly double our VSAN deployment.”

“It really did work as advertised…the fact that I have been able to set it and forget it is huge!”

Page 7: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Introduction to VMware Virtual SAN

• Storage scale-out architecture built into the hypervisor

• Aggregates locally attached storage from each ESXi host in a cluster

• Dynamic capacity and performance scalability

• Flash-optimized storage solution

• Fully integrated with vSphere and interoperable:
  – vMotion, DRS, HA, VDP, VR …

• VM-centric data operations

[Diagram: Virtual SAN Datastore]

Page 8: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Proof of Concept Considerations

Before you start …

Page 9: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Before Considering a Virtual SAN PoC

Accelerate Use Case

Planning Outcome

Page 10: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Organization Challenges

Culture Barrier
• The fear about what you do not know and the lack of control and visibility

Storage team operations
• New methodology
• New way to see things and operate
• Converged compute and storage

Support
• Single Point of Contact
• No vendor finger pointing

Page 11: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Technical Requirements

Hardware
• EVO:RAIL, VSAN Ready Node or do-it-yourself
• Uniform configuration

Networking
• Shared network vs. dedicated
• Distributed Switch vs. Standard
• Multicast

Storage
• Controller choices
• RAID0 vs. pass-through
• SSD/HDD ratio choices
• Performance vs. endurance
• SAS expanders

Page 12: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

What I Need to Be Successful

Tools to conduct a successful Virtual SAN POC

Page 13: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Success Tool #1: Health Plugin

• Introduced with Virtual SAN 6.0

• Incorporated in the vSphere Web Client

• The Virtual SAN Health Check tool includes:
  – General Health
  – Proactive tests
  – Virtual SAN HCL health
  – Physical disk health

• Especially useful for observing injected errors and verifying that they have been remediated

Page 14: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Success Tool #1: Health Plugin

• Proactive tools running on Virtual SAN cluster and pre-production tests
  – VM Creation test
  – Storage Load test
  – Multicast Performance test

Page 15: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Success Tool #2: RVC/Virtual SAN Observer

• Native tools installed on the vCenter Server Appliance (VCSA) and on vCenter Server for Windows

• Used for configuration and status of the Virtual SAN cluster

• For on-demand performance and activity monitoring
  – VM level
  – Host level
  – VMDK level
  – HDD/SSD level

• Any anomalies will show up with the metric in question shown in red

Page 16: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Success Tool #2: RVC/Virtual SAN Observer

vsan.apply_license_to_cluster

vsan.enable_vsan_on_cluster

vsan.disable_vsan_on_cluster

vsan.clear_disks_cache

vsan.cluster_change_autoclaim

vsan.cluster_set_default_policy

vsan.enter_maintenance_mode

vsan.fix_renamed_vms

vsan.object_reconfigure

vsan.host_wipe_vsan_disks

vsan.recover_spbm

vsan.reapply_vsan_vmknic_config

Cluster

vsan.check_limits

vsan.check_state

vsan.cluster_info

vsan.cmmds_find

vsan.whatif_host_failures

vsan.resync_dashboard

Disk

vsan.disk_object_info

vsan.disks_info

vsan.disks_stats

Host

vsan.host_info

vsan.host_consume_disks

Networking

vsan.lldpnetmap

VM

vsan.vm_object_info

vsan.vm_perf_stats

vsan.vmdk_stats

vsan.obj_status_report

vsan.object_info

Troubleshooting

vsan.support_information

vsan.observer

Virtual SAN Operation Virtual SAN Information

Virtual SAN Monitoring

Page 17: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Success Tool #3: Virtual SAN Pack for vROps

• Integrates with vRealize Operations 6.0.1, the comprehensive vSphere monitoring software

• Available with the Advanced or Enterprise edition

• Collects SSD/HDD disk performance across the cluster

• Collects SMART information

• Monitors information across multiple levels:
  – disk group
  – host
  – cluster
  – datacenter

Page 18: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Custom Dashboards

In the VSAN cluster
• Disk Group Throughput
• SSD/MDs Information
• Capacity Usage by hosts

Page 19: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Success Tool #4: Log Insight

• Built-in with VMware vSphere

• Troubleshooting tool

• Log analytics tool

• Any Virtual SAN failure can be correlated across hosts and disk groups

• Track Virtual SAN operations

Storage – VSAN view

Storage – VSAN Interactive Analytic view

Page 20: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Validation Scenarios

Expected outcomes from various activities

Page 21: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

PoC Validation

• What are the most important validation tests?

1. Successful VSAN configuration
2. Successful VM deployments on VSAN datastore
3. VM availability in the event of failures (host, storage device, network)
4. VSAN serviceability
5. VM performance meets expectations

Page 22: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #1 – Successfully Deploy VSAN

• Ensure correct vSphere versions

• Appropriate licenses are available (if PoC is going to take a long time)

• Ensure network is in place. Remember the multicast requirement, so prep the network team.

• Minimum of three servers.

• Minimum of three servers contributing storage:
  – At least one storage controller – check the HCL, verify drivers and firmware are valid
  – At least one flash device (SSD, PCIe) for cache – make sure these are on the HCL
  – At least one magnetic disk or flash device for capacity – check the HCL
  – Or consider VSAN Ready Nodes as an option …

Remember, the VSAN Health Check will do most of this work for you
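
The minimum requirements listed for Case #1 can be sketched as a quick pre-flight check. This is only an illustration in Python, not a VMware tool: the `Host` structure, its field names, and `check_cluster` are hypothetical, and HCL, driver, and firmware validation are not modeled. The Health Check remains the authoritative source.

```python
from dataclasses import dataclass

MIN_HOSTS = 3  # minimum number of servers stated on the slide


@dataclass
class Host:
    """Hypothetical description of one ESXi host in the PoC cluster."""
    name: str
    controllers: int = 0       # storage controllers
    cache_devices: int = 0     # flash devices (SSD, PCIe) used for cache
    capacity_devices: int = 0  # magnetic disks or flash devices used for capacity

    def contributes_storage(self) -> bool:
        # A host contributes storage only if it has at least one of each.
        return (self.controllers >= 1
                and self.cache_devices >= 1
                and self.capacity_devices >= 1)


def check_cluster(hosts):
    """Return a list of problems; an empty list means the minimums are met."""
    problems = []
    if len(hosts) < MIN_HOSTS:
        problems.append(f"need at least {MIN_HOSTS} hosts, found {len(hosts)}")
    contributing = [h for h in hosts if h.contributes_storage()]
    if len(contributing) < MIN_HOSTS:
        problems.append(
            f"need at least {MIN_HOSTS} hosts contributing storage, "
            f"found {len(contributing)}"
        )
    return problems
```

A three-host cluster in which one host has no capacity device would report a single problem: only two hosts contribute storage.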

Page 23: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #1 – Successfully Deploy VSAN

Run this after every test! Also use it to make sure you fixed the problem you previously introduced!

Check the Virtual SAN Health Check plugin regularly

Page 24: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #2: Successful VM Deployment

Use the Health Check to do initial VM deployment check

Part of the Proactive Tests. This will verify that VMs can be created on the VSAN cluster

Page 25: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #2: Successful VM Deployment

I created a new VM, but I am not sure where the VM is stored

Component host location

Page 26: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #3: VM Availability in the Event of Failures

• There are various failures that may be introduced as part of a typical POC
  – Host failure
  – Flash device / magnetic disk failure – cache/capacity failures
  – Network failure

• The primary objective is to ensure that the VM continues to be available in the event of a failure. This might mean the VM is restarted on another node in the cluster.

• vSphere HA also has a role to play here. It is integrated with Virtual SAN.

Page 27: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #3.1: Host Failures

• How many hosts do I really need?

• A minimum of 3 hosts is needed to support VSAN

• What about rebuilding after a failure or maintenance mode operations?

• If you want virtual machines to remain highly available on VSAN during these scenarios, consider configuring for additional capacity, i.e. a minimum of 4 nodes
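
The host-count reasoning for Case #3.1 can be expressed as simple arithmetic. The sketch below assumes the usual mirroring rule that tolerating n host failures requires 2n + 1 hosts (which gives the 3-host minimum for one failure), and adds one spare host when components must be rebuilt or a host evacuated while VMs stay protected. The 2n + 1 rule and the function name `hosts_needed` are assumptions that go beyond what the slide states.

```python
def hosts_needed(failures_to_tolerate: int = 1, allow_rebuild: bool = False) -> int:
    """Estimate the number of hosts for a PoC cluster.

    Assumption (not from the slide): tolerating n failures with mirroring
    needs 2n + 1 hosts. The slide's own figures are 3 hosts as the minimum
    and 4 hosts when rebuild or maintenance headroom is wanted.
    """
    if failures_to_tolerate < 0:
        raise ValueError("failures_to_tolerate must be non-negative")
    # Never go below the 3-host minimum stated on the slide.
    minimum = max(3, 2 * failures_to_tolerate + 1)
    # One extra host gives somewhere to rebuild or evacuate to.
    return minimum + 1 if allow_rebuild else minimum
```

With the defaults this returns 3; asking for rebuild headroom returns 4, matching the two figures on the slide.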

Page 28: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #3.2: Storage Failures

• The Virtual SAN 6.0 Proof Of Concept Guide has details on how to inject temporary disk errors for the purpose of testing
  – A real disk failure results in immediate rebuild activity initiated by VSAN

Eject/Offline/Unplug: Absent
• Wait 60 minutes before remediation

Failure: Degraded
• Immediate remediation
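
The difference between the two states in Case #3.2 matters when timing a PoC test: an absent device does not trigger a rebuild until the 60-minute wait has passed, while a degraded device does so immediately. A minimal sketch of that rule follows; the enum and function names are hypothetical and are not part of any VMware API.

```python
from enum import Enum


class ComponentState(Enum):
    ABSENT = "absent"      # device ejected, offlined, or unplugged
    DEGRADED = "degraded"  # device reported an actual failure


# Minutes to wait before remediation starts, as given on the slide.
REBUILD_DELAY_MINUTES = {
    ComponentState.ABSENT: 60,
    ComponentState.DEGRADED: 0,
}


def rebuild_due(state: ComponentState, minutes_elapsed: float) -> bool:
    """Return True once remediation is expected to have started."""
    return minutes_elapsed >= REBUILD_DELAY_MINUTES[state]
```

In practice this means a pulled-disk test needs at least an hour of observation before rebuild traffic should be expected, whereas an injected failure can be checked right away.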

Page 29: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #3.3: Network Failure

Part of the Proactive Tests. This will verify that multicast performance is acceptable for the VSAN cluster

Multicast configuration is the most common issue

Page 30: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #3.4: Validating Rebuild Activity after Failure

• Virtual SAN might need to move data around in the background: change policy, host failure, long term/permanent component loss, user triggered reconfig, maintenance mode, etc.

• UI Resync Dashboard shows the VMs that are resyncing and remaining bytes to sync

Remember! Test one thing at a time!

Page 31: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #4: VSAN Serviceability

I want to update one of my ESXi hosts in a VSAN cluster. What do I do?

VSAN provides multiple options for maintenance mode

Page 32: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #4: VSAN Serviceability

Ensure Availability
• Loss of VM compliance
• Short maintenance window
• Short storage preparation
• Limited free storage space required

Full Data Migration
• Full VM data compliance
• More than one hour of maintenance
• Long storage preparation
• Free storage space required on the other nodes

No Data Migration
• No VM availability ensured
• Short maintenance window
• No impact on storage preparation
• No impact on free storage space
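
The trade-offs between the three maintenance mode options can be turned into a rough decision helper. This is a sketch of one reasonable reading of the comparison above, not VMware guidance: the function name, its parameters, and the use of 60 minutes as the cut-off for "short" maintenance are assumptions.

```python
def choose_maintenance_option(expected_minutes: float,
                              must_stay_compliant: bool,
                              vms_must_stay_available: bool = True) -> str:
    """Suggest a maintenance mode option for one host (illustrative only)."""
    if expected_minutes < 0:
        raise ValueError("expected_minutes must be non-negative")
    # Long maintenance, or a need for full policy compliance, calls for
    # moving all data off the host first.
    if must_stay_compliant or expected_minutes > 60:
        return "Full Data Migration"
    # Short maintenance where VMs must keep running, accepting that they
    # are temporarily out of compliance.
    if vms_must_stay_available:
        return "Ensure Availability"
    # Fastest option, but VM availability is not ensured.
    return "No Data Migration"
```

For example, a 20-minute firmware update where VMs must keep running but temporary non-compliance is acceptable maps to "Ensure Availability".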

Page 33: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #4: Management – Disks Serviceability

The disk serviceability feature enables identification of the magnetic disks and flash-based devices that need to be replaced

Page 34: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #4: Management – Disk/Disk Group Evacuation

• Allows you to evacuate data from disk groups and individual disks before removing a disk/disk group from a Virtual SAN host

• Allows Virtual SAN to ensure all workloads stay fully compliant with their policy!

• Supported in the UI, ESXCLI and RVC

• Check box in the “Remove disk/disk group” UI screen

Page 35: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

How to Measure Virtual SAN Performance?

Page 36: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

How to Test Performance…

• The distributed architecture of VMware Virtual SAN dictates that reasonable performance is achieved when the pooled compute and storage resources in the cluster are well utilized

• This usually means a number of VMs each running the specified workload should be distributed in the cluster and run in a consistent manner to deliver aggregated performance

• This part of an evaluation can be complex and time-consuming

• Real application workloads are best, but …
  – synthetic workloads (IOmeter) might be easier to set up
  – simplistic workloads don’t really reflect what Virtual SAN can do

• Worth a read: Pro Tips For Storage Performance Testing
  – http://blogs.vmware.com/storage/2015/08/12/tips-storage-performance-testing/

Page 37: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Performance Testing Considerations

Is the test utilizing the distributed storage resources of Virtual SAN?

• Multiple VMs across multiple hosts will deliver better performance than a single VM on one host

Is the working set fully in cache, utilizing flash performance?

• Read-cache misses will incur latency

Is the workload cache friendly?

• Sustained sequential write workloads fill cache, which must then be destaged. Mixed R/W workloads are best

Is the cache warmed?

• Results from the start of a test will not reflect overall performance
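
One way to act on the cache warm-up point is to discard the first part of each test run before averaging. The sketch below does this for a list of latency samples; the function name and the 25% default warm-up fraction are arbitrary assumptions chosen for illustration, not values from the presentation.

```python
from statistics import mean


def steady_state_mean(samples, warmup_fraction=0.25):
    """Average the samples after dropping the initial warm-up portion.

    warmup_fraction is a hypothetical default; pick it by inspecting where
    the measured values level off in your own test.
    """
    if not 0 <= warmup_fraction < 1:
        raise ValueError("warmup_fraction must be in [0, 1)")
    skip = int(len(samples) * warmup_fraction)
    steady = samples[skip:]
    if not steady:
        raise ValueError("no samples left after discarding warm-up")
    return mean(steady)
```

If the first quarter of a run shows latencies several times higher than the rest, the steady-state figure is the one to compare against the PoC success criteria.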

Page 38: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Performance Considerations

• Application
  – Single vs. multiple workers
  – Working set size – is it all in cache?
  – Sequential workloads versus random workloads – cache friendly workload?
  – Outstanding I/Os – have you a decent queue depth on the storage controllers?
  – Block size – if synthetic, does it represent the typical application block size?
  – Guest file system considerations – raw or not?

• VSAN
  – Cache warm up considerations
  – Number of magnetic disk drives/striping considerations
  – Performance during failures and rebuild activity

Page 39: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Performance Test with IOmeter

• Do NOT forget to warm the SSD before your performance test

• First test:
  – Single worker
  – < 8 Outstanding I/O
  – Write I/O Data Pattern will use repeating bytes
  – 4KB I/O size
  – 70% Read/30% Write
  – 100% Random

• Consider moving, over time, to:
  – multiple workers
  – multiple VMs
  – multiple hosts
  – Increasing OIO – latency versus IOPS
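
The "latency versus IOPS" trade-off when increasing outstanding I/O (OIO) follows Little's Law, a general queueing result that is not stated in the presentation: outstanding I/Os = IOPS × latency. The sketch below applies it; the function name and the example numbers are illustrative assumptions, not measured Virtual SAN results.

```python
def iops_from(outstanding_ios: int, latency_ms: float) -> float:
    """Estimate achievable IOPS from queue depth and average latency.

    Little's Law: outstanding I/Os = IOPS * latency (in seconds),
    so IOPS = outstanding I/Os / latency.
    """
    if outstanding_ios <= 0 or latency_ms <= 0:
        raise ValueError("outstanding_ios and latency_ms must be positive")
    # Convert milliseconds to seconds by scaling the numerator by 1000.
    return outstanding_ios * 1000.0 / latency_ms
```

Raising OIO increases IOPS only while latency stays roughly flat. Once the devices saturate, extra OIO mostly shows up as higher latency, which is why the two should be plotted together as OIO grows.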

Page 40: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Virtual SAN Health Check Plugin – Proactive Storage Tests

• Run this performance test in a non-production environment

• It will create ~10-20 VMDKs per host which will be distributed by VSAN onto physical disks and then issue a synthetic IO workload on all VMDKs on all hosts in parallel

• A way to validate IOPS and bandwidth requirements

Page 41: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

From PoC to Production

Day 2 Operation Considerations

Page 42: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Considerations

HA/DR
• Stretched Cluster
• Use of VR/SRM

Monitoring
• Set up alarms
• Use of vROps
• vSAN Health Plugin

Operations
• Maintenance Mode
• Workflow
• Third-party tools
• SSD/HD rebuild

Design for Scaling
• Scripted install
• Capacity planning

Page 43: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept
Page 44: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept
Page 45: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Page 46: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #4: Other ways of monitoring VSAN Activity

• VSAN Health Check Plugin
  – Rerun tests and check if any of the many checks have failed
  – Any check that fails will also generate an alarm (new in version 6.1)
  – Link to VMware KB via AskVMware to assist with troubleshooting

• vRealize Operations Management with storage pack for VSAN
  – Ships with a number of preconfigured dashboards
  – Surfaces various events and warnings that are specific to VSAN
  – Provides troubleshooting guidance

• vRealize Log Insight
  – Examines logs from VSAN events as well as VSAN traces

Page 47: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #4: Monitoring VSAN Activity

Number of Virtual SAN Clusters

Virtual Machine Object

Top Virtual SAN issues

Virtual SAN Alerts

VM Information through vROps

Page 48: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #4 : Monitoring VSAN Activity

Magnetic disks used by this Virtual SAN Cluster

Storage Performance

Disk latencies through vROps

Page 49: VMworld 2015: Conducting a Successful Virtual SAN Proof of Concept

Case #4 : Observing VSAN Activity

Host disconnected from the network

Impact of failure on VSAN, along with recommendations on what to do next