Extreme Performance Series: Predictive DRS - Performance and Best Practices

SER2849BU

Speaker: Sai Inabattini

VMworld 2017 Content: Not for publication or distribution

Disclaimer

• This presentation may contain product features that are currently under development.

• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.

• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

• Technical feasibility and market demand will affect final delivery.

• Pricing and packaging for any new technologies or features discussed or presented have not been determined.

Problem: Occasional Resource Contention

Case 1

• Some VMs suffer briefly because of periodic resource usage surges in other VMs in my cluster. Result: application performance drops

Case 2

• I tend to reserve capacity for VMs based on their peak load, even when their average load is much lower. Result: inefficient resource usage

Addressing Resource Contention

1. Reactive

– Move VMs when the contention happens

– Benefits: Minimal overhead, only move VMs that must be moved

– Performance impact: VMs may suffer briefly, since remediation happens after contention starts

2. Proactive

a) Statically reserve more resources

• Performance impact: None, but overprovisioning wastes capacity

b) Learn the workload pattern and move VMs before the resource demand spike

• Performance impact: None for regular, periodic workloads

• Limitation: Cannot handle unexpected short-term resource demand spikes

• Limitation: Cannot predict events that trigger load imbalance

What Is the Best Solution?

Good balance of both approaches:

Predicting future demand + reacting to current and future demand

This is Predictive DRS (pDRS)

• New in vSphere 6.5 and vROps 6.4

Agenda

1. Introduction to pDRS

2. Software Requirements and Configuration

3. Application Performance Case Studies

4. Frequently Asked Questions

5. Conclusion

Introduction: What Is pDRS?

What Is Predictive DRS?

• DRS enabled with predictions

• Powerful resource scheduling of DRS + predictive analytics of vROps

• Uses both reactive and proactive approaches to balance the workload distribution

[Diagram labels: vSphere DRS; vRealize Operations]

How Does pDRS Work?

[Diagram: resource usage data flows from vCenter to vROps; vROps returns predictions to DRS/vCenter, which generates recommendations]

vROps Dynamic Thresholds (DT)

• Sophisticated analytics: 10 different algorithms

• Learns normal behavior for every metric of every object

• Detects hourly, daily, and monthly patterns

• Generates upper and lower bounds of “normal” behavior, called Dynamic Thresholds (DT)

• The DT is then refined and filtered to generate predictions
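
As a rough illustration of what an upper and lower bound of "normal" can look like, the sketch below groups metric samples by hour of day and takes a low and a high percentile as the band. This is not the vROps implementation (which, per the slide, combines 10 algorithms); the function name, the 10th/90th percentile choice, and the sample data are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import quantiles


def hourly_thresholds(samples, low_pct=10, high_pct=90):
    """Return {hour_of_day: (lower, upper)} from (hour_of_day, value) samples.

    Illustrative only: a simple percentile band per hour, not vROps' algorithm.
    """
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)

    bands = {}
    for hour, values in by_hour.items():
        if len(values) < 2:
            continue  # not enough history to form a band
        # n=100 yields 99 cut points; index p-1 is the p-th percentile.
        cuts = quantiles(values, n=100, method="inclusive")
        bands[hour] = (cuts[low_pct - 1], cuts[high_pct - 1])
    return bands


# Hypothetical 14 days of CPU usage (%) sampled at 09:00 and at 03:00.
history = [(9, 60 + day % 5) for day in range(14)]
history += [(3, 10 + day % 3) for day in range(14)]
bands = hourly_thresholds(history)
```

With this data the 09:00 band sits well above the 03:00 band, which is the kind of daily pattern the predictions rely on.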

Software Requirements and Configuration

Software Requirements

• vSphere 6.5 (Enterprise Plus)

• vRealize Operations Manager (vROps) 6.4 or newer

– vROps 6.4: Supports up to 4000 VMs in a cluster

– vROps 6.5 and above: No limit on the vROps side

vROps + vSphere 6.5 (Enterprise Plus) = pDRS at no additional cost

Configuration - vSphere

• Enable Predictive DRS in vCenter Server

– Cluster → Configure → vSphere DRS

• Make sure that the clocks on vCenter Server and vROps are synchronized to within a few minutes.

Note: If the clock skew exceeds 5 minutes, vCenter Server discards the predictions.
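
The clock-skew rule can be expressed as a small check. This is a minimal sketch, assuming you have already read the current time from each system by some means of your own; the function name is hypothetical, and only the 5-minute threshold comes from the slide.

```python
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(minutes=5)  # threshold stated on the slide


def predictions_usable(vcenter_time, vrops_time, max_skew=MAX_SKEW):
    """Return True when the two clocks differ by no more than max_skew."""
    return abs(vcenter_time - vrops_time) <= max_skew


# Hypothetical readings taken from the two systems.
vc_now = datetime(2017, 8, 28, 12, 0, 0, tzinfo=timezone.utc)
in_sync = predictions_usable(vc_now, vc_now + timedelta(minutes=2))
skewed = predictions_usable(vc_now, vc_now - timedelta(minutes=7))
```

Here `in_sync` is True and `skewed` is False, so predictions would only be honored in the first case.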

Configuration – vROps

Enable the vCenter adapter to provide statistics to pDRS

Application Performance Case Studies

Test Scenario

• “Follow the sun” model

– Type A: 8 hours

– Type B: 8 hours

– Type C: 8 hours

[Figure: workload timeline, annotated where Type A ends and Type B starts]

Detecting Workload Surges

• Two cases

– Impact of predictions on the Type A workloads (already running)

– Impact of predictions on the Type B workloads (about to start)

• Test Benchmark: DVDStore

– Benchmark tool that simulates an online store that sells DVDs

– OLTP Database test workload

– Workload is CPU intensive

– Throughput is recorded in transactions per minute

• Test VMs

– Windows VMs

– DVDStore VMs

Impact on Type A Workloads

• Initial state

[Figure labels: DVDStore VMs (Type A); Idle Windows VMs (Type B)]

Impact on Type A (contd.)

Predictions from vROps for Type B VMs

Impact on Type A (contd.)

• DRS recommendations (due to predictions for Type B VMs)

[Figure panels: pDRS enabled; pDRS disabled]

Impact on Type A (contd.)

• Final state after workload surge

[Figure panels: pDRS disabled; pDRS enabled]

Impact on Type A (contd.)

• Application performance

[Chart annotations: Type B workload starts; DRS remediates the imbalance]

Impact on Type B (Newly Started Workloads)

• Initial State

[Figure labels: Idle DVDStore VMs (Type B); Windows VMs (Type A)]

Impact on Type B (contd.)

• Predictions from vROps for Type B VMs

Impact on Type B (contd.)

• DRS recommendations

[Figure panels: pDRS enabled; pDRS disabled]

Impact on Type B (contd.)

• Final state after workload surge

[Figure panels: pDRS disabled; pDRS enabled]

Impact on Type B (contd.)

• Application performance

[Chart: DVDStore, effect of load during application startup. Annotations: application starts; DRS remediates load imbalance; workload stabilizes]

II. Distributed Power Management (DPM) with Predictions

• DPM is the cluster-level power management engine that provides additional power savings

• Dynamically consolidates workloads during periods of low resource utilization

• Migrates virtual machines onto fewer hosts and powers off the ESXi hosts that are no longer needed

• When workload demand increases, powers the ESXi hosts back on

DPM with Predictions (contd.)

Distributed Power Management with Predictions (contd.)

Frequently Asked Questions

Workloads that pDRS Can Predict

• Any type of workload with a periodic usage pattern

• Short spikes on the order of minutes will not be predicted

• The more consistent the workload, the more accurate the predictions

Learning Period

• The default learning period for generating predictions is a minimum of 14 days

• The longer the learning period, the better the accuracy of the predictions

• Predictions become available only after 14 days of data have been collected

Current Demand vs Future Demand

• pDRS always ensures that current VM demand is not compromised by predicted future demand

• VM demand = Max(Current demand, Future demand)

• Current demand of VMs will never be clipped in favor of future demand
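
A short worked example of the Max rule, showing why current demand is never clipped. The function name and the MHz figures are assumptions for illustration; only the Max(Current demand, Future demand) formula comes from the slide.

```python
def effective_demand(current, predicted):
    """Demand used for placement, per the slide: Max(current, future)."""
    return max(current, predicted)


# Hypothetical CPU demand in MHz for a single VM.
cases = {
    # A surge is predicted: plan for the higher future demand.
    "surge predicted": effective_demand(current=800, predicted=2400),
    # A lull is predicted: current demand still wins, so nothing is clipped.
    "lull predicted": effective_demand(current=2400, predicted=800),
}
```

In both cases the result is 2400 MHz: the prediction can only raise the demand DRS plans for, never lower it below what the VM needs now.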

Tuning

• Compute Dynamic Thresholds

– You can manually force vROps to collect data for calculating Dynamic Thresholds (DT)

– “Administration” → “Support” → “Dynamic Thresholds”

• Look-ahead interval

– The amount of time DRS looks ahead when accounting for predictions

– Default value is 1 hour

– Use the DRS advanced option ProactiveDrsLookaheadIntervalSecs to change it

– Maximum allowed value is 12 hours
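
The look-ahead limits translate into seconds as follows. This is a sketch under two assumptions: that the option value is expressed in seconds (suggested by the "Secs" suffix) and that it must be positive. The helper name is hypothetical, and it only builds the name/value pair rather than applying it to a cluster.

```python
OPTION_NAME = "ProactiveDrsLookaheadIntervalSecs"  # name taken from the slide
DEFAULT_LOOKAHEAD_SECS = 1 * 3600   # default: 1 hour
MAX_LOOKAHEAD_SECS = 12 * 3600      # maximum: 12 hours


def lookahead_option(hours):
    """Return (option_name, seconds) for a look-ahead given in hours."""
    secs = int(hours * 3600)
    if not 0 < secs <= MAX_LOOKAHEAD_SECS:
        raise ValueError("look-ahead must be positive and at most 12 hours")
    return (OPTION_NAME, secs)


default_setting = lookahead_option(1)
four_hour_setting = lookahead_option(4)
```

A 4-hour look-ahead therefore corresponds to a value of 14400, and anything above 43200 is rejected by the check.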

VMs with Predictions vs VMs without Predictions

• How will pDRS behave when I have a mix of VMs (some with predictions, and some without) in the same cluster?

• VMs with predictions:

– VM demand = Max(Current demand, Future demand)

• VMs without predictions:

– VM demand = Current demand
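
For a mixed cluster, the two rules can be applied per VM and then totalled per host. In this sketch a missing prediction is modelled as None, and host demand is taken to be the plain sum of VM demands; both are simplifying assumptions for illustration, not a description of DRS internals.

```python
from typing import Optional


def vm_demand(current: float, predicted: Optional[float] = None) -> float:
    """Max(current, future) when a prediction exists, else current demand."""
    if predicted is None:
        return current
    return max(current, predicted)


def host_demand(vms):
    """Sum the per-VM demand for (current, predicted_or_None) pairs."""
    return sum(vm_demand(current, predicted) for current, predicted in vms)


# Hypothetical host with three VMs (CPU demand in MHz).
host_a = [
    (500, 1500),   # prediction shows a surge: counts as 1500
    (700, None),   # no prediction available: counts as 700
    (900, 300),    # prediction shows a lull: still counts as 900
]
total = host_demand(host_a)
```

The total here is 3100 MHz, compared with 2100 MHz if only current demand were considered.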

Filtering Predictions in vROps

• Once a day, vROps sends the next 26 hours of predictions to vCenter Server (VC)

• The 26 hours are covered by 52 samples, one sample for every 30 minutes

• Prediction samples that do not meet the accuracy criteria are discarded and set to a value of -1

• Consecutive identical samples are merged and sent as a single multi-hour sample
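
The filtering and merging steps amount to a mask followed by a run-length merge. In the sketch below the accuracy criteria are not modelled (the slides do not define them), so each sample simply carries a precomputed pass/fail flag. The function name is hypothetical, and treating runs of discarded samples as mergeable is also an assumption.

```python
from itertools import groupby

SAMPLE_MINUTES = 30   # one sample every 30 minutes
WINDOW_HOURS = 26     # predictions cover the next 26 hours
DISCARDED = -1        # value assigned to samples that fail the accuracy check


def filter_and_merge(samples, accurate):
    """Return [(value, duration_minutes), ...] after filtering and merging.

    samples:  predicted values, one per 30-minute slot
    accurate: booleans, True where the sample met the accuracy criteria
    """
    cleaned = [v if ok else DISCARDED for v, ok in zip(samples, accurate)]
    # groupby collapses each run of consecutive identical values.
    return [(value, len(list(run)) * SAMPLE_MINUTES)
            for value, run in groupby(cleaned)]


# Hypothetical day of predictions: 52 samples for 26 hours.
samples = [100] * 4 + [200] * 2 + [300] * 46
accurate = [True] * 4 + [False] * 2 + [True] * 46
merged = filter_and_merge(samples, accurate)
```

For this input the 52 samples collapse to three entries: 2 hours at 100, 1 hour discarded, and 23 hours at 300.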

Identify vMotions Due to Predictions

• How can I differentiate DRS vMotions due to predictions?

• Resource demand on Host A increased due to increased demand in the green and red VMs

• DRS moved the blue VMs to Host B to balance the cluster load

• In this case, DRS chose the blue VMs because moving them balances the cluster faster

[Figure: Host A and Host B. Legend: VMs without predictions; VMs with predictions; VMs with predictions dropped]

Conclusion

• pDRS can help avoid contention before the performance of a VM degrades

• Forecasting in pDRS works best for VMs with periodic workload patterns

• Current demand will never be clipped to favor future demand

• Provides the best solution by combining the reactive and predictive approaches

DRS Flings

• DRS Lens

– Provides a simple, yet powerful interface to highlight the value proposition of vSphere DRS

– https://labs.vmware.com/flings/drs-lens

• DRS Dump Insight

– Service portal where users can upload drmdump files to get a summary of the DRS run

– https://labs.vmware.com/flings/drs-dump-insight

Other Performance Sessions

• vCenter Performance Deep Dive [SER1504BU]

• Extreme Performance Series: Performance Best Practices [SER2724BU]

• Extreme Performance Series: vSAN Performance Troubleshooting [STO1515BU]

• Maximum Performance with Mark Achtemichuk [VIRT2368GU]
