Deploying Big Data on HCI Powered by vSAN

STO3192BU

Vahid Fereydouny, Sr Product Line Marketing Manager
Sachin Sundar, Product Line Marketing Manager

#VMworld #STO3192BU

VMworld 2017 Content: Not for publication or distribution

Post on 08-Mar-2018


Page 2

Disclaimer

• This presentation may contain product features that are currently under development.
• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
• Technical feasibility and market demand will affect final delivery.
• Pricing and packaging for any new technologies or features discussed or presented have not been determined.

Page 3

Agenda

1 vSAN overview and use cases
2 Big Data overview and use cases
3 Big Data on vSAN Performance Assessment
4 MongoDB on vSAN
5 Splunk on vSAN
6 Summary and Q&A

Page 4

vSAN Overview & Use Cases

Page 5

HCI Powered by vSAN Overview

3-Tiered Architecture (built on proprietary hardware)
• Virtualization
• Compute
• Storage Networking
• Storage

Hyper-Converged Infrastructure (built on industry-standard hardware)
• Virtualization
• Compute
• Storage
• Networking
• Management

© 2017 VMware Inc. All rights reserved. Confidential – Not for Distribution

Page 6

Supporting a Broad Variety of Use Cases

vSAN use cases:
• Business Critical Apps
• Virtual Desktops (VDI)
• DR / DA
• Cloud Native Apps
• Databases (SQL/Oracle)
• ROBO
• Management Clusters
• Containers

Page 7

vSAN Is Used For Mixed Workloads

What applications do you run on vSAN today?

• VDI: VMware Horizon 21%, Citrix XenDesktop 10%
• Microsoft Applications: Microsoft SharePoint, Microsoft Exchange Server (15%, 26%)
• New Use Cases: NoSQL databases (Cassandra, etc.), Hadoop and other big data applications (9%, 3%)
• Databases: Microsoft SQL Server, MySQL Databases, Oracle Databases, SAP (67%, 38%, 18%, 7%)

Growing trend for customers to deploy their big data on vSAN

Source: TechValidate survey of 316 users of VMware vSAN

Page 8

Tiered All-Flash and Hybrid Options Provide Choice

Virtual SAN uses two tiers: caching and data persistence.

All-Flash
• 100K+ IOPS per host, sub-millisecond latency
• Caching tier (SSD, PCIe, NVMe): writes cached first
• Capacity tier (flash devices): reads primarily from capacity tier

Hybrid
• 40K IOPS per host
• Caching tier (SSD, PCIe, NVMe): read and write cache
• Capacity tier: SAS / NL-SAS / SATA

Page 9

HCI Deployments Provide Choice

Dell EMC VxRail Appliances: All-in-one HCI Appliance
• Dell EMC best-in-breed data protection
• Rapid time-to-value with multiple configurations
• Single, pro-active vendor support for software and hardware

VMware vSAN ReadyNodes: Fully Customizable HCI
• Choice of 5x more server vendors
• Software and support flexibility
• Backup agnostic to minimize change

Page 10

Big Data Overview and Use Cases

Page 11

Big Data Analytics – Simplified Landscape

• Analytics & Visualization
• Data Preparation
• Big Data Platform
• Big Data Infrastructure: Compute, Network, Storage

Page 12

Examples of Big Data Use Cases

[Diagram labels: Edge/On-Premises; Cloud or On-Premises; Device/Customer 1 … Device/Customer N; Data Collection; Analytics & Machine Learning; Support; Customer; Sales/Marketing]

Page 13

Some of the Major Customer Challenges in Adoption of Big Data Are Related to Infrastructure

Some of the top pain points are related to infrastructure

• Security, scalability, manageability, and performance top the list of infrastructure pain points

[Chart: Top 10 Pain Points for Hadoop Workloads, n = 43. Categories: Security; Data movement across platforms; Better analytic tools; Performance; Easy to scale in/out; Easy to manage; Better reporting tools. Series: Top 1 Pain Point, Top 7 Pain Points. Data labels: 12, 4, 1, 4, 1, 4, 2 and 17, 19, 21, 16, 16, 12, 14.]

Source: VMware internal focus groups; 43 responses from different companies, based on 6 focus groups in Europe and the U.S.

Page 14

Big Data Implementations in Silos Are Adding Major Cost and Complexity

[Diagram labels: On-Premises (On-Prem, physical servers); Public Cloud (Amazon, Google Cloud Platform, Other)]

Page 15

Our Long Term Vision is to Break Those Silos

[Diagram labels: Public Cloud; Storage & Availability; Hyper-Converged Software]

Page 16

Major Benefits of Virtualizing Big Data

Simplified Management
• Centralized data center management
• Apply virtualization best practices

Efficiency
• Resource pooling
• Server and cluster consolidation

Agility
• Infrastructure on demand
• Sharing of physical resources – not dedicated clusters

Performance
• Equal to, or better performance than, native Hadoop
• No significant overhead

Page 17

Big Data on vSAN + Intel Hardware

Page 18

The Existing Hadoop Architecture

• Client: submits the job
• ResourceManager: master scheduler
• NameNode: master file system index
• Workers:
  Worker Node 1: NodeManager, DataNode, AppMaster - 1, HDFS Block 1
  Worker Node 2: NodeManager, DataNode, Container - 2, HDFS Block 2
  Worker Node 3: NodeManager, DataNode, Container - 3, HDFS Block 3

Page 19

Hadoop – in Virtual Machines

• Master roles: ResourceManager, NameNode
• Job input file: Split 1 – 64MB, Split 2 – 64MB, Split 3 – 64MB
• Workers:
  Worker Node 1: NodeManager, DataNode, AppMaster - 1, Block 1 – 64MB
  Worker Node 2: NodeManager, DataNode, Container - 2, Block 2 – 64MB
  Worker Node 3: NodeManager, DataNode, Container - 3, Block 3 – 64MB
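The split layout on the Hadoop slide (one input file cut into three 64MB splits, one per worker) can be sketched as simple arithmetic. The sketch below is an illustration only: the function name and the 192 MB file size are hypothetical, and real Hadoop input formats may handle a small trailing remainder differently.

```python
MB = 1024 * 1024


def plan_splits(file_size_bytes: int, split_size_bytes: int = 64 * MB) -> list:
    """Return the size in bytes of each input split for a single file."""
    if file_size_bytes <= 0:
        return []
    full_splits, remainder = divmod(file_size_bytes, split_size_bytes)
    sizes = [split_size_bytes] * full_splits
    if remainder:
        # The last split holds whatever is left over.
        sizes.append(remainder)
    return sizes


# Hypothetical 192 MB input file: three full 64 MB splits, as on the slide.
splits = plan_splits(192 * MB)
print(len(splits), [size // MB for size in splits])
```

Each split would be handed to one map task, which is why three worker nodes each process one block in the diagram.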

Page 20

Hadoop Deployment – Virtualized on vSphere + vSAN

Stack: SQL, In-Memory, Map-Reduce, NoSQL, Stream, Search, Custom; HDFS+YARN; vSphere + vSAN; vSAN Datastore

• Security: Native in vSphere and vSAN
• Scalability: Scale-up/Scale-Out Architecture, Mixing of workloads in the same cluster
• Manageability: Standardization, Fine grained policy management
• Performance:
  ✓ vSAN Architecture
  ✓ All-Flash Hardware

Page 21

Why Big Data on All Flash vs HDDs?

Superior Performance
• Up to 13x better transaction throughput [1]
• Up to 200x better read performance [2]
• Write IOPS up to 95x better

CapEx Savings!
• Significant reduction in SSD prices
• Cost/GB is at par with HDDs

Superior operating experience
• Fewer drives
• No moving parts and smaller form factors
• Up to 5X better MB/Watt [5]

Innovation
• Standardized interface for PCI Express® SSDs
• Capacity: 3D NAND
• Ultra high performance: 3D XPoint™

Page 22

Test Server Configuration

Intel® Server Board S2600WTTR System
• CPU: 2S Intel® Xeon® E5-2690 v4, 14 Cores/28 Threads (35M Cache, 2.60 GHz)
• Memory: 256 GB (16x 16GB) DDR4 2133 MHz RAM
• Network: 2x Intel® Ethernet Server Adapter SFP+ X520-DA2 (2 x 10 Gbps), Dual Port, PCIe v2.0 (5.0GT/s), x8 Lane
• Boot: Intel® DC S3610 Series
• Cache: 4x Intel® DC P3700 800GB
• Capacity: 12x Intel® DC S3610 1.6TB
• Storage controllers: 2x LSI 3008-8i 12Gbps RAID controllers

Intel® Server Products for Cloud – vSAN* Ready Nodes
http://www.intelserveredge.com/intel-cloud-block-vsan/

Page 23

vSphere and vSAN Host Configuration

Platform
• ESXi 6.0.0 Update 2
• Arista 7150S-64 SFP+ (10GbE)
• vCenter Server Appliance on the Mgmt. VLAN

Networking
• vSwitch 1: 1GbE, Mgmt Network
• vSwitch1: 2x10GbE (bonded), vSAN Network
• vSwitch 2: 2x10GbE (bonded), VM Network

vSAN
• Max 4 Disk Groups Per Host
• Per disk group: 1 x Intel® DC P3700 800GB (Caching Tier), 3 x Intel® DC S3610 1.6TB (Capacity Tier)
• Number of Disk Stripes: 3
• Number of Failures to Tolerate: 0

Virtual machines
• 4 VMs Per Host
• CentOS 6.7 x86_64
• 14 vCPU, 48GB vRAM
• OS: 1x40GB LSI Logic (Thin Provision)
• Data: 4x400GB LSI Logic
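The per-host figures on the host configuration slide can be cross-checked against the test server slide with a little arithmetic. The following is a minimal sketch: the class and field names are hypothetical, and it assumes all four disk groups are populated as shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostLayout:
    """Per-host layout taken from the slides (field names are illustrative)."""
    disk_groups: int = 4
    cache_devices_per_group: int = 1
    cache_device_gb: int = 800
    capacity_devices_per_group: int = 3
    capacity_device_gb: int = 1600
    vms: int = 4
    vcpus_per_vm: int = 14
    vram_gb_per_vm: int = 48
    data_disks_per_vm: int = 4
    data_disk_gb: int = 400

    def cache_gb(self) -> int:
        return self.disk_groups * self.cache_devices_per_group * self.cache_device_gb

    def raw_capacity_gb(self) -> int:
        return self.disk_groups * self.capacity_devices_per_group * self.capacity_device_gb

    def total_vcpus(self) -> int:
        return self.vms * self.vcpus_per_vm

    def total_vram_gb(self) -> int:
        return self.vms * self.vram_gb_per_vm

    def vm_data_gb(self) -> int:
        return self.vms * self.data_disks_per_vm * self.data_disk_gb


host = HostLayout()
host_threads = 2 * 28  # 2 sockets x 28 threads, from the test server slide
host_ram_gb = 256

print("cache tier (GB):", host.cache_gb())
print("raw capacity tier (GB):", host.raw_capacity_gb())
print("vCPUs vs hardware threads:", host.total_vcpus(), "/", host_threads)
print("vRAM vs physical RAM (GB):", host.total_vram_gb(), "/", host_ram_gb)
print("provisioned VM data disks (GB):", host.vm_data_gb())
```

Under these assumptions the four VMs use exactly the 56 hardware threads of the host and leave memory headroom for the hypervisor.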

Page 24

Hadoop Configuration…

Cloudera Hadoop - 5.7.0

• CDH Management: AP, ES, HM, SM, S; boot vol 40GB (Thin); 4 CDH vols, 400GB/vol
• Name Node: NN, G, S, JHS, RM, B; boot vol 40GB (Thin); 4 NN vols, 400GB/vol
• Data Node 1: SNN, NM, S, DN; boot vol 40GB (Thin); 4 data vols, 400GB/vol
• Data Node 2-27: NM, DN; boot vol 40GB (Thin); 4 data vols, 400GB/vol
• Network: vSwitch 2 - 2x10GbE (bonded) VM Network

Legend
• Cloud Mgmt Service (AP – Alert Publisher, ES – Event Server, HM – Host Monitor, SM – Service Monitor)
• HDFS Services (NN – Name Node, SNN – Secondary Name Node, DN – Data Node)
• YARN Services (G – Gateway, RM – Resource Manager, JHS – Job History Server, NM – Node Manager)
• Zookeeper Services (S – Server)

Page 25

… Hadoop Configuration Details

CDH Tuning

Parameter                               Value
dfs.blocksize                           256 MiB
dfs.replication                         3
dfs.client.use.datanode.hostname        TRUE
mapreduce.task.io.sort.mb               400 MiB
yarn.scheduler.minimum-allocation-mb    2 GiB
mapreduce.map.memory.mb                 2.1 GiB
mapreduce.reduce.memory.mb              2.1 GiB
mapreduce.map.cpu.vcores                1
mapreduce.reduce.cpu.vcores             1
mapreduce.job.heap.memory-mb.ratio      0.8

Test Scenarios
• Linux tuning (see backup for details)
• CDH Tuning (table above)
• Tera Benchmark Suite (1TB+)
• Different vSAN configs – disk groups, FTT, host affinity, etc.

Goals
• Identify optimal vSAN system configuration for analytics workloads
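A few derived quantities follow from the CDH tuning values. The sketch below assumes the heap ratio is applied as heap = ratio × container memory, and that a 1 TB data set means 10^12 bytes; the helper names are hypothetical.

```python
MIB = 2**20

# Values from the CDH tuning table, normalised to bytes or MiB.
cdh = {
    "dfs.blocksize": 256 * MIB,
    "dfs.replication": 3,
    "mapreduce.map.memory.mb": round(2.1 * 1024),  # 2.1 GiB expressed in MiB
    "mapreduce.job.heap.memory-mb.ratio": 0.8,
}


def jvm_heap_mib(container_mib: int, ratio: float) -> int:
    """Heap available to a task JVM inside a container of the given size."""
    return int(container_mib * ratio)


def hdfs_blocks(dataset_bytes: int, block_bytes: int) -> int:
    """Number of HDFS blocks needed for a data set (ceiling division)."""
    return -(-dataset_bytes // block_bytes)


dataset_bytes = 10**12  # assumption: 1 TB = 10^12 bytes

heap = jvm_heap_mib(cdh["mapreduce.map.memory.mb"],
                    cdh["mapreduce.job.heap.memory-mb.ratio"])
blocks = hdfs_blocks(dataset_bytes, cdh["dfs.blocksize"])
stored_bytes = dataset_bytes * cdh["dfs.replication"]

print("map task heap (MiB):", heap)
print("HDFS blocks for 1 TB:", blocks)
print("bytes stored by HDFS with replication:", stored_bytes)
```

The replication factor multiplies the on-disk footprint before any vSAN-level protection is counted, which is why the tests vary FTT and dfs.replication together.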

Page 26

Workloads - MapReduce

TeraSort Suite
– Most popular Hadoop test, supplied with distribution, exercises CPU, memory, disk, network
– TeraGen – generates specified number of 100 byte records – 1, 3, and 5 TB used in tests
– TeraSort – sorts TeraGen output
– TeraValidate – validates TeraSort output is in sorted order
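Because TeraGen is sized by record count rather than bytes, the 1, 3, and 5 TB data sets translate into row counts. A minimal sketch, assuming decimal terabytes; the jar name and output paths in the printed command lines are hypothetical and vary by distribution.

```python
RECORD_BYTES = 100   # TeraGen record size, from the slide
TB = 10**12          # assumption: decimal terabytes


def teragen_rows(dataset_tb: int) -> int:
    """Number of 100-byte records needed for a data set of the given size."""
    return dataset_tb * TB // RECORD_BYTES


for size_tb in (1, 3, 5):
    rows = teragen_rows(size_tb)
    print(f"{size_tb} TB -> {rows:,} rows")
    # Hypothetical invocation; adjust the jar location and HDFS path.
    print(f"  hadoop jar hadoop-mapreduce-examples.jar teragen {rows} "
          f"/benchmarks/teragen-{size_tb}tb")
```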

Page 27

Performance Results - Disk Groups (DGs)

Having more disk groups spreads I/O and gives faster response times

vSAN FTT=1, Hadoop dfs.rep=2

Page 28

Performance Results – Host Affinity (Tech Preview)

Host affinity avoids replicas on the same host and provides faster response time

4 DGs, vSAN FTT=0, Hadoop dfs.rep=3

Page 29

Best Practices on Running Hadoop on vSAN

• 4 VMDKs per VM helps spread IO across SCSI controllers for uniform I/O distribution
• HDFS, vSAN provide fault tolerance with replication and erasure codes
• vSAN host affinity option (Tech Preview) provides data locality and best performance
• Using multiple disk groups helps IO performance
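The two test configurations combine vSAN and HDFS protection differently. The sketch below counts physical copies per HDFS block, assuming vSAN uses mirroring so that FTT=n stores n+1 replicas of each object; erasure coding would change the multiplier. The function name is hypothetical.

```python
def physical_copies(vsan_ftt: int, hdfs_replication: int) -> int:
    """Copies of each HDFS block on disk, assuming vSAN mirroring (FTT+1 replicas)."""
    if vsan_ftt < 0 or hdfs_replication < 1:
        raise ValueError("FTT must be >= 0 and HDFS replication >= 1")
    return (vsan_ftt + 1) * hdfs_replication


configs = {
    "disk group test (FTT=1, dfs.replication=2)": (1, 2),
    "host affinity test (FTT=0, dfs.replication=3)": (0, 3),
}

for name, (ftt, replication) in configs.items():
    copies = physical_copies(ftt, replication)
    print(f"{name}: {copies} copies, {copies} TB on disk per 1 TB of data")
```

With FTT=0 and host affinity, HDFS alone provides redundancy, so each terabyte of data occupies three terabytes instead of four.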

Page 30

MongoDB on vSAN

Page 31

MongoDB: Accommodates Large Volumes of Rapidly Changing Structured, Semi-structured and Unstructured Data

Overview
• MongoDB is an open source document database
• NoSQL DB
• Built on an architecture of collections and documents
• Documents comprise sets of key-value pairs and are the basic unit of data in MongoDB
• Collection: A grouping of MongoDB documents
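The document and collection concepts can be illustrated with plain Python data structures. This is only an analogy, not the MongoDB API: the sample fields and the `find` helper are hypothetical.

```python
# A document: a set of key-value pairs (hypothetical sensor reading).
reading = {
    "_id": 1,
    "device": "sensor-17",
    "temperature_c": 21.5,
    "tags": ["lab", "floor-2"],
}

# A collection: a grouping of documents. Documents in the same collection
# do not have to share the same keys.
readings = [
    reading,
    {"_id": 2, "device": "sensor-18", "humidity_pct": 40},
]


def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


print(find(readings, device="sensor-18"))
```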

Page 32

Solution Overview and Test Environment

Test Environment
• vSAN 6.6 and vSphere 6.5.0d
• CentOS 7.3
• MongoDB 3.4 (community version)
• Yahoo Cloud Serving Benchmark (YCSB) 0.12.0

MongoDB Definitions
• ConfigDB: the configuration database for the MongoDB cluster's internal use
• Mongos: "MongoDB Shard" (routing services)
• Mongod: primary process for handling data requests, managing data access, and performing background management operations

Page 33

Solution Configuration: Hardware and Software

Performance Analysis

Hardware

Component         Specification
Server            SuperMicro SSG-2027R-AR24NV
CPU cores         2 sockets, 10 cores 3.0GHz with hyper-threading enabled
RAM               512GB DDR4 RDIMM
Network adapter   2 x Intel 10 Gigabit X540-AT2, + I350 1Gb Ethernet
Storage adapter   2 x 12Gbps SAS PCI-Express
Disks             SSD: 2 x 3,000GB NVMe drive as cache SSD
                  SSD: 8 x 400GB SATA drive as capacity SSD

Software

Component                CPU Cores   Memory   OS Disk   Data Disk
ConfigDB (3 instances)   8           32GB     32GB      200GB
Mongos                   8           64GB     32GB      None
Mongod                   8           64GB     32GB      200GB

• Mongos: The routing service in MongoDB
• Mongod: The daemon for data processing
• Baseline: 100 million records (128 GB)

The rule of thumb is that the aggregated CPU cores and memory should not exceed the physical resources.
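The rule of thumb about aggregated CPU cores and memory can be expressed as a small check. This is a sketch under stated assumptions: the slides do not give the number of hosts or Mongos instances, so one Mongos and the host counts below are hypothetical, the four Mongod servers come from the baseline settings, and vCPUs are compared against physical cores rather than hyper-threads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSpec:
    role: str
    count: int
    vcpus: int
    memory_gb: int


HOST_CORES = 2 * 10      # 2 sockets x 10 cores, from the hardware table
HOST_MEMORY_GB = 512

vms = [
    VmSpec("ConfigDB", 3, 8, 32),
    VmSpec("Mongos", 1, 8, 64),   # instance count is an assumption
    VmSpec("Mongod", 4, 8, 64),   # four data servers, per the baseline settings
]


def totals(specs):
    """Aggregate vCPUs and memory (GB) across all VM instances."""
    vcpus = sum(s.count * s.vcpus for s in specs)
    memory_gb = sum(s.count * s.memory_gb for s in specs)
    return vcpus, memory_gb


def fits(specs, hosts: int) -> bool:
    """True if the aggregate VM resources stay within the physical resources."""
    vcpus, memory_gb = totals(specs)
    return vcpus <= hosts * HOST_CORES and memory_gb <= hosts * HOST_MEMORY_GB


print("aggregate (vCPUs, GB):", totals(vms))
for hosts in (3, 4):  # hypothetical cluster sizes
    print(f"{hosts} hosts:", "fits" if fits(vms, hosts) else "oversubscribed")
```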

Page 34

YCSB Workload Types for Performance Evaluation

YCSB: NoSQL DB performance assessment tool (open source)

• Workload A (update heavy workload): 50% Read, 50% Update
• Workload B (read mostly workload): 95% Read, 5% Update
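The two workload mixes can be written down as YCSB-style property files. The sketch below generates the text in Python; the property names (`recordcount`, `operationcount`, `readproportion`, `updateproportion`) follow YCSB's core workload as I understand it and should be checked against YCSB 0.12.0. The record and operation counts are the baseline values reported in the tests.

```python
# Read/update mixes from the slide.
WORKLOADS = {
    "A": {"readproportion": 0.5, "updateproportion": 0.5},    # update heavy
    "B": {"readproportion": 0.95, "updateproportion": 0.05},  # read mostly
}


def workload_properties(name: str, record_count: int, operation_count: int) -> str:
    """Render one workload as key=value lines."""
    mix = WORKLOADS[name]
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("operation proportions must sum to 1")
    props = {
        "recordcount": record_count,
        "operationcount": operation_count,
        **mix,
    }
    return "\n".join(f"{key}={value}" for key, value in props.items())


print(workload_properties("A", 100_000_000, 25_000_000))
print()
print(workload_properties("B", 100_000_000, 25_000_000))
```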

Page 35

Performance Testing: Evaluate the Impact of Different Client Threads – Optimal to be 128

[Charts: Workload A, Workload B]

Using 128 client threads yields maximum throughput while keeping latency lower than at higher thread counts

Page 36

Performance Testing: Evaluate the Impact of Different YCSB Operation Count – Performance at Steady State ~25M

[Charts: Workload A, Workload B]

We used an operation count of 25M across our tests for consistency. Change this based on your requirements.

Page 37

Performance Testing: Parameter Settings in the Baseline Testing

Parameter                        Value
Mongod data server number        4
Mongod data disk size            200GB
Mongod CPU cores number          8
Mongod memory size               64GB
Mongos CPU cores number          8
Mongos memory size               64GB
Enable Mongod replica set?       No
vSAN stripe width setting        1
vSAN FTT                         1
vSAN object checksum             Disabled
YCSB client threads              128
Database entry size              100 million
Operation count                  25 million
MongoDB durability 'w' option    1 (true)
MongoDB durability 'j' option    1 (true)

Performance baseline without any optimization
• Workload A: 28,529 ops/sec, with average read latency 0.64ms and average update latency 8.3ms.
• Workload B: 119,554 ops/sec, with average read latency 0.81ms and average update latency 5.7ms.
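The baseline numbers can be sanity-checked for internal consistency. The sketch below computes the operation-weighted average latency for each workload mix and compares it with the latency implied by Little's Law (concurrency = throughput × latency) for 128 client threads; treating every thread as always having one request in flight is an assumption.

```python
CLIENT_THREADS = 128
OPERATION_COUNT = 25_000_000

# (read share, ops/sec, read latency ms, update latency ms) from the baseline results
BASELINE = {
    "A": (0.50, 28_529, 0.64, 8.3),
    "B": (0.95, 119_554, 0.81, 5.7),
}


def mixed_latency_ms(read_share: float, read_ms: float, update_ms: float) -> float:
    """Average latency per operation, weighted by the read/update mix."""
    return read_share * read_ms + (1.0 - read_share) * update_ms


def littles_law_latency_ms(threads: int, ops_per_sec: float) -> float:
    """Latency implied by Little's Law when each thread keeps one request in flight."""
    return threads / ops_per_sec * 1000.0


for name, (read_share, ops, read_ms, update_ms) in BASELINE.items():
    measured = mixed_latency_ms(read_share, read_ms, update_ms)
    implied = littles_law_latency_ms(CLIENT_THREADS, ops)
    runtime_s = OPERATION_COUNT / ops
    print(f"Workload {name}: weighted {measured:.2f} ms, "
          f"implied {implied:.2f} ms, run time about {runtime_s:.0f} s")
```

The weighted and implied latencies agree to within a few percent for both workloads, which suggests the reported throughput and latency figures are mutually consistent.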

Page 38

Performance Testing: Evaluate the Impact of Different Virtual CPU Cores and Memory Configurations

[Charts: Workload A, Workload B]

Increasing memory and CPU can have a major impact on performance

Page 39

Performance Testing: Evaluate the Impact of Different Database Size in Terms of Entries

[Charts: Workload A, Workload B]

For larger databases, increase memory and CPU to avoid performance penalties. (This result is based on 8 CPU cores and 64GB memory for the MongoDB servers.)

Page 40

Performance Testing: Evaluate the Impact of MongoDB Replica Set Setting

[Charts: Workload A, Workload B]

rs=1 means no application replication and rs=3 means application replication.

Turning off MongoDB's replica set provides better performance and lower latency

Page 41

Performance Testing: Evaluate the Impact of Different vSAN Object Stripe Width

[Charts: Workload A, Workload B]

Increasing the stripe width can improve performance, but the gain is modest. It is recommended only when performance is low.

Page 42

Performance Testing: Evaluate the Impact of Different vSAN RAID Levels

[Charts: Workload A, Workload B]

RAID 5 saves storage space but leads to lower throughput and higher latency. Users should weigh the tradeoff between storage space and performance.
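The space side of the RAID tradeoff can be quantified. The sketch below assumes the commonly cited vSAN overheads for FTT=1, namely 2x for RAID 1 mirroring and 4/3x for RAID 5 (3 data + 1 parity); the 800GB example is the four 200GB Mongod data disks from the baseline.

```python
from fractions import Fraction

# Assumed capacity multipliers for FTT=1.
OVERHEAD = {
    "RAID-1": Fraction(2),      # two full mirror copies
    "RAID-5": Fraction(4, 3),   # 3 data + 1 parity
}


def consumed_gb(usable_gb: int, raid_level: str) -> float:
    """Raw capacity consumed to store usable_gb under the given RAID level."""
    return float(usable_gb * OVERHEAD[raid_level])


mongod_data_gb = 4 * 200  # four Mongod servers with a 200GB data disk each

for level in OVERHEAD:
    print(f"{level}: {consumed_gb(mongod_data_gb, level):.1f} GB raw "
          f"for {mongod_data_gb} GB of data disks")
```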

Page 43

Best Practices on Running MongoDB on vSAN

• Before the deployment: size the environment properly
• Use MongoDB shards
• Optionally turn off MongoDB's replica set and leverage vSphere HA instead
• Appropriate CPU and memory sizing is essential
• Choose a data durability option that balances performance against availability
• Try a larger vSAN stripe width if performance is low
• Follow the MongoDB best practices

Page 44

Apache Cassandra on vSAN

• DataStax delivers Apache Cassandra in a database platform purpose-built for the performance and availability demands of Web, Mobile, and IoT applications
• Solution Overview: DataStax Enterprise on vSAN at HERE

Page 45

Splunk on vSAN

Page 46

Splunk Enterprise on vSAN

➢ vSAN: storage for hot/warm/cold buckets
➢ Hardware
  • 1 VxRail cluster
  • All-flash SSD storage
➢ Application layer
  • Dedicated VM for Splunk

Page 47

Splunk Enterprise on vSAN and Isilon

➢ vSAN + Isilon
  • vSAN for Splunk hot/warm buckets
  • Isilon for Splunk cold buckets (Isilon X410)
➢ Hardware
  • 1 VxRail cluster
  • All-flash SSD storage
  • 1 Isilon cluster

Page 48

Splunk-validated Sizing Configurations

Deployment                                     Data volume   Retention                      Concurrent users
Departmental deployment                        50GB/day      90-day                         Less than 8
Small enterprise distributed deployment        500GB/day     90-day                         Less than 64
Medium enterprise distributed deployment       1 TB/day      90-day                         More than 64
Medium enterprise indexer cluster deployment   1 TB/day      7-day for hot/warm;            More than 64
                                                             configurable for cold
                                                             buckets on Isilon
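Multiplying daily volume by retention gives the amount of raw data ingested over the retention window for each validated configuration. This is a rough sketch only: it is not the on-disk size, since Splunk compresses and indexes data and the validated storage figures account for that. The dictionary keys and function name are hypothetical.

```python
# (daily volume in GB, retention in days) per validated configuration.
# For the indexer cluster, only the 7-day hot/warm window on vSAN is counted.
SIZING = {
    "Departmental": (50, 90),
    "Small enterprise distributed": (500, 90),
    "Medium enterprise distributed": (1000, 90),
    "Medium enterprise indexer cluster (hot/warm)": (1000, 7),
}


def ingested_gb(daily_gb: int, retention_days: int) -> int:
    """Raw data ingested over the retention window, before compression."""
    return daily_gb * retention_days


for name, (daily_gb, days) in SIZING.items():
    print(f"{name}: {ingested_gb(daily_gb, days):,} GB ingested over {days} days")
```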

Page 49

Departmental Deployment Validated Sizing Configuration

One VxRail node for up to 50 GB/day with 90-day retention

Hardware Configuration

VxRail Model:       VxRail E460F
Specification:      40 x 2.2GHz cores
                    384GB (24 x 16GB) RAM
                    1 Disk Group with 800GB Cache SSD
                    5.235TB (3 x 1.92TB SSD) raw capacity
# of Nodes:         1
Storage Required:   3.3 TB** (includes space for OS and 20% reserved for free space)

Deployment Configuration

Instance Role     Qty   Physical Cores/vCPUs   Memory   OS Storage   Indexer Storage
Single Instance   1     32/64                  256GB    300GB        3TB

Page 50

Small Enterprise Distributed Deployment

Four VxRail nodes for up to 500 GB/day (distributed) or up to 250 GB/day (clustered) with 90-day retention

Hardware Configuration

VxRail Model:       VxRail E460F (per node)
Specification:      40 x 2.2GHz cores
                    512GB (16 x 32GB) RAM
                    2 Disk Groups, each with 800GB Cache SSD
                    20.94 TB (6 x 3.84TB SSD) raw capacity
# of Nodes:         4
VxRail Cluster:     83.8 TB raw capacity
                    40.3 TB effective usable capacity
Storage Required:   27.8 TB (includes space for OS and 20% reserved for free space)

Deployment Configuration

Instance Role   Qty   Physical Cores/vCPUs   Memory   OS Storage   Indexer Storage
Search Head     1     32/64                  256GB    300GB        0
Indexer         2     32/64                  256GB    300GB        13.9TB
Admin Server    1     32/64                  256GB    150GB        0

Page 51

Medium Enterprise Distributed Deployment Validated Sizing Configuration

Seven VxRail nodes for up to 1 TB/day (distributed) with 90-day retention

Hardware Configuration

VxRail Model:       VxRail E460F (per node)
Specification:      40 x 2.2GHz cores
                    384GB (24 x 16GB) RAM
                    2 Disk Groups, each with 800GB Cache SSD
                    20.94 TB (6 x 3.84TB SSD) raw capacity
# of Nodes:         7
VxRail Cluster:     146.6 TB raw capacity
                    70.7 TB effective usable capacity
Storage Required:   56.4 TB (includes space for OS and 20% reserved for free space)

Deployment Configuration

Instance Role   Qty   Physical Cores/vCPUs   Memory   OS Storage   Indexer Storage
Search Head     1     32/64                  256GB    300GB        0
Indexer         5     32/64                  256GB    300GB        10.8TB
Admin Server    1     32/64                  256GB    150GB        0

Page 52: STO3192BU Deploying Big Data on HCI Powered by vSAN · PDF fileDeploying Big Data on HCI Powered by vSAN 1 VMworld 2017 ... Cloud Native Apps Databases (SQL/Oracle) ROBO Management

Seven VxRail nodes with Isilon for up to 1 TB/day (clustered) with 7-day retention for hot/warm buckets and configurable retention for cold buckets

Medium Enterprise Indexer Cluster Deployment

Instance Role | Qty | Physical Cores/vCPUs | Memory | OS Storage | Indexer Storage
Search Head | 1 | 32/64 | 256GB | 300GB | 0
Indexer | 5 | 32/64 | 256GB | 300GB | 2.1TB
Admin Server | 1 | 32/64 | 256GB | 150GB | 0

VxRail Model | Specification | # of Nodes | Storage Required
VxRail E460F (per node) | 40 x 2.2GHz cores; 384GB (24 x 16GB) RAM; 1 Disk Group with 800GB Cache SSD; 5.235TB (3 x 1.92TB SSD) raw capacity | 7 | 12.5 TB (includes space for OS and 20% reserved for free space)
VxRail Cluster | 36.6 TB raw capacity; 16.1 TB effective usable capacity | |
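The arithmetic behind the two sizing configurations can be sketched roughly as follows. This is a minimal illustration, not the method used to produce the slides: the `Role` class and the helper names are hypothetical, and 1 TB is assumed to equal 1000 GB. The per-VM totals it computes (about 55.95 TB and 12.45 TB) land close to, but not exactly on, the "Storage Required" figures of 56.4 TB and 12.5 TB; the slides do not show how those figures were derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """One row of the instance-role table (hypothetical helper type)."""
    name: str
    qty: int
    os_storage_tb: float
    indexer_storage_tb: float


def vm_storage_tb(roles):
    """Total virtual disk capacity requested by all Splunk instances, in TB."""
    return sum(r.qty * (r.os_storage_tb + r.indexer_storage_tb) for r in roles)


def cluster_raw_tb(nodes, per_node_raw_tb):
    """Raw cluster capacity: node count times raw capacity per node."""
    return nodes * per_node_raw_tb


def build_roles(indexer_storage_tb):
    # Quantities and OS storage are identical in both configurations;
    # only the per-indexer storage differs. Assumes 1 TB = 1000 GB.
    return [
        Role("Search Head", 1, 0.300, 0.0),
        Role("Indexer", 5, 0.300, indexer_storage_tb),
        Role("Admin Server", 1, 0.150, 0.0),
    ]


DISTRIBUTED = build_roles(indexer_storage_tb=10.8)  # 90-day retention on vSAN
CLUSTERED = build_roles(indexer_storage_tb=2.1)     # 7-day hot/warm on vSAN

if __name__ == "__main__":
    configs = [
        ("distributed", DISTRIBUTED, 20.94, 70.7),
        ("clustered", CLUSTERED, 5.235, 16.1),
    ]
    for label, roles, per_node_raw, usable in configs:
        needed = vm_storage_tb(roles)
        raw = cluster_raw_tb(7, per_node_raw)
        print(f"{label}: VMs need {needed:.2f} TB, raw {raw:.1f} TB, "
              f"fits in usable capacity: {needed <= usable}")
```

The raw-capacity helper reproduces the cluster figures on the slides (7 x 20.94 TB and 7 x 5.235 TB), and in both cases the summed VM storage fits within the stated effective usable capacity.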


Performance Testing: Linear Scalability of vSAN Cluster Size for Various Workloads

Performance Analysis

[Charts: All Read Workload (4KB); Sequential R/W Workload (256KB)]

The charts show throughput and latency scalability results for the All Read and Sequential Read/Write workloads. The results show that vSAN scales close to linearly as the cluster grows.


Performance Testing: Linear Scalability of vSAN Cluster Size for Mixed Workloads

Performance Analysis

[Charts: Mixed R/W Workload (4KB); Mixed R/W Workload (32KB); panels labeled Workload A and Workload B]

The charts show latency and throughput scalability results for the mixed workloads. The results show that vSAN scales close to linearly as the cluster grows.
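One way to put a number on "close-to-linear" is a scaling-efficiency ratio: measured speedup divided by the ideal speedup implied by the node count. The sketch below is only an illustration; the function name is hypothetical, and the node counts and throughput values in the example are made up, since the transcript does not carry the chart data.

```python
def scaling_efficiency(base_nodes, base_throughput, nodes, throughput):
    """Return measured speedup divided by ideal (linear) speedup.

    1.0 means perfectly linear scaling; values just below 1.0 mean
    close-to-linear scaling.
    """
    if min(base_nodes, base_throughput, nodes, throughput) <= 0:
        raise ValueError("all inputs must be positive")
    ideal_speedup = nodes / base_nodes
    measured_speedup = throughput / base_throughput
    return measured_speedup / ideal_speedup


if __name__ == "__main__":
    # Hypothetical numbers, NOT taken from the slides:
    # a 4-node cluster at 1000 MB/s growing to 8 nodes at 1900 MB/s.
    print(f"{scaling_efficiency(4, 1000.0, 8, 1900.0):.2f}")
```

The same ratio can be applied to IOPS instead of throughput; latency would be compared separately, since ideal scaling keeps latency flat rather than proportional to node count.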


Best Practices for Running Splunk on vSAN

• For performance, use vSAN; if you also need to store a large amount of cold data, use vSAN with Isilon

• Increase the number of disk stripes as needed for the highest performance

• Run a proactive disk rebalance after adding new nodes


Summary and Call to Action

• Consolidation and simplified management are driving the need to deploy Big Data workloads on vSAN

• All-Flash vSAN Ready Nodes deliver cost-efficient performance, reliability, and hassle-free manageability

• Consider evaluating “All Flash” vSAN for Big Data workloads today!


Additional Resources

• White Paper - Intel and VMware White Paper: New Era of Hyper-Converged Big Data Using Hadoop with All-Flash VMware vSAN

• Using Splunk Enterprise with VxRail Appliance and Isilon for Analysis of Machine Data

• Blogs - Big Data on All-Flash vSAN? Of Course!

• Storage Hub - MongoDB on VMware vSAN


Big Data Sessions

Tuesday, 29th August 2017

• [VIRT1997BU] Machine Learning and Deep Learning on VMware vSphere: GPUs are Invading the Software-Defined Data Center: 5:30-6:30pm

• [VIRT2274GU] Group Discussion on Virtualizing Big Data and Machine Learning: 5:30-6:30pm

Wednesday, 30th August 2017

• [LDT2800PU] Harnessing the Power of Data in a Virtual World: 9:30-10:30am

• [MTE4789VIRT] Meet the Experts Session: 11:15am-12:00pm, Table 7

• This is an opportunity to meet with VMware’s big data people in a small group context. Booking your time slot ahead of the meeting is advised.

Thursday, 31st August 2017

• [VIRT1445BU] Extreme Performance Series: Fast Virtualized Hadoop and Spark on All-Flash Disks: 10:30-11:30am


3 Easy Ways to Learn More about vSAN

Hands-On Lab

• Live at VMworld

• Practical learning of vSAN, VxRail and more

• 24x7 availability online, for free!

Test drive vSAN for free today!

New vSAN Tools

• vSAN Sizer

• vSAN Assessment

Storage Hub Technical Library

• StorageHub.vmware.com

• Reference architectures, off-line demos and more

• Easy search function

• And More!


Nerd Out With These Key vSAN Activities at VMworld

#HitRefresh on your current data center and discover the possibilities!

Become a vSAN Specialist

Earn VMware digital badges to showcase your skills

• New 2017 vSAN Specialist Badge

• Education & Certification Lounge: VM Village

• Certification Exam Center: Jasmine EFG, Level 3

Practice with Hands-on-Labs

Learn from self-paced and expert-led hands-on labs

• vSAN Getting Started Workshop (Expert led)

• VxRail Getting Started (Self paced)

• Self-Paced lab available online 24x7

Visit SDDC Assessment Lounge

Discover how to assess if your IT is a good fit for HCI

• Four Seasons Willow Room/2nd floor

• Open from 11am – 5pm Sun, Mon, and Tue

• Learn more at Assessing & Sizing in STO1500BU


SDDC ASSESSMENT LOUNGE

Four Seasons Hotel, Desert Willow Room

HOURS

Sunday, August 27th 1:00pm – 5:00pm

Monday, August 28th 11:00am – 5:00pm

Tuesday, August 29th 11:00am – 5:00pm

vSphere Optimization Assessments (VOA)

Hybrid Cloud Assessment

Virtual Network Assessment

vSAN Assessment


Backup


TEST TOOL: Iometer

• Open source benchmark tool

• Emulates disk or network I/O load

• Works for both single and clustered systems

• Can be used to measure:

– Performance of disk and network controllers

– Bandwidth and latency capabilities of buses

– Network throughput to attached drives

– Shared bus performance

– System-level hard drive performance

– System-level network performance


TEST SCENARIO

• Iometer Workload Profile

Workload Profile | I/O Size | Read/Write Ratio | Random/Sequential Ratio | Outstanding I/O
All Read | 4KB | 100% Read | 100% Random | 16
Mixed Read/Write | 4KB | 70% Read, 30% Write | 100% Random | 4
Mixed Read/Write | 32KB | 70% Read, 30% Write | 100% Random | 2
Sequential Read | 256KB | 100% Read | 100% Sequential | 8
Sequential Write | 256KB | 100% Write | 100% Sequential | 8

• Iometer VM Configuration

vCPU | RAM | VMDK
4 | 4GB | 10 * 9GB eager-zeroed-thick VMDK

• Performance Metrics

Metrics | Unit
IOPS | I/O per second
Latency | Millisecond
Throughput | Megabyte per second
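The three metrics are related: throughput is IOPS multiplied by I/O size, and, by Little's law, average latency is roughly the number of outstanding I/Os divided by IOPS when the queue is kept full. The sketch below illustrates both relations. It assumes binary units (1 KB = 1024 bytes, 1 MB = 1024 KB), which the slides do not specify; the function names are hypothetical, and the IOPS values in the example are illustrative rather than measured results.

```python
BYTES_PER_KB = 1024          # assumption: binary units
BYTES_PER_MB = 1024 * 1024


def throughput_mb_per_s(iops, io_size_kb):
    """Throughput in MB/s for a given IOPS rate and I/O size in KB."""
    return iops * io_size_kb * BYTES_PER_KB / BYTES_PER_MB


def avg_latency_ms(outstanding_io, iops):
    """Average latency in ms from Little's law: concurrency = rate x latency.

    Holds only if the stated number of I/Os is kept in flight continuously.
    """
    if iops <= 0:
        raise ValueError("iops must be positive")
    return outstanding_io * 1000 / iops


# I/O size (KB) and outstanding I/O per workload profile, from the table above.
PROFILES = {
    "All Read": (4, 16),
    "Mixed Read/Write 4KB": (4, 4),
    "Mixed Read/Write 32KB": (32, 2),
    "Sequential Read": (256, 8),
    "Sequential Write": (256, 8),
}

if __name__ == "__main__":
    example_iops = 8000  # illustrative value, not a measured result
    for name, (size_kb, oio) in PROFILES.items():
        print(name,
              f"{throughput_mb_per_s(example_iops, size_kb):.1f} MB/s",
              f"{avg_latency_ms(oio, example_iops):.2f} ms")
```

This is why small-block profiles are usually judged by IOPS and latency, while the 256KB sequential profiles are judged by throughput: at the same IOPS, a 256KB I/O moves 64 times the data of a 4KB I/O.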


HARDWARE & SOFTWARE RESOURCES

Hardware Configuration

Hardware Components | Details
VxRail E460F | 2 Intel® Xeon® Processors E5-2698 v4 @ 2.20 GHz per node; 384 GB (24 x 16 GB) or 512 GB (16 x 32 GB); 800 GB per disk group (1 or 2 disk groups); 5.235TB (3 x 1.92TB) or 20.94TB (6 x 3.84TB SSD) capacity per node**; 2 x 10 GbE SFP+ per node
Switch | Fabric interconnect
Isilon X410 | 2 Intel® Xeon® Processors 2.0 GHz per node; 128 GB RAM per node; 3.2 TB SSD storage; 64 TB HDD storage; 2 x 10 GbE SFP+ per node; 2 x 1 GbE per node


HARDWARE & SOFTWARE RESOURCES

Software Configuration

Software Components | Details
Splunk Enterprise | 6.5.0
Splunk Universal Forwarder | 6.5.0
RedHat Linux 64-bit | 6.7
VMware vSphere Enterprise | 6.0 U2
VMware vCenter Server | 6.0 U2
VMware Virtual SAN Enterprise | 6.2
VMware vRealize Log Insight | 3.3.1
VxRail Manager | 4.0
OneFS | 8.0.0.3


Q&A


Intel Platforms Tick-Tock Development Model

Platform | PCH | Microarchitecture Codename | Tock (New Micro-architecture) | Tick (New Process Technology)
Thurley Platform | Tylersburg PCH | Nehalem | Nehalem, 45nm | Westmere, 32nm
Romley Platform | Patsburg PCH | Sandy Bridge | Sandy Bridge, 32nm | Ivy Bridge, 22nm
Grantley Platform (Today) | Wellsburg PCH | Haswell | Haswell, 22nm | Broadwell, 14nm

Xeon E5 v4 socket compatible with v3 series


BIOS Setup

Profiles

– CPU Power and Performance Policy: Performance

– Workload Configuration: Balanced

– Memory RAS Configuration: Maximum Performance

– Fan Profile: Performance

Enabled

– Hyper-Threading

– NUMA Optimized

– Enhanced Intel SpeedStep® Tech

– Intel® Turbo Boost Technology

– Uncore Frequency Scaling

– Performance P-Limit

Disabled

– Cluster on Die

– Early Snoop

– CPU C States

– Energy Efficient Turbo

Test Setup (Linux OS)

/etc/sysctl.conf

vm.swappiness = 10
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.netdev_max_backlog = 250000

/etc/security/limits.conf

* soft nofile 65536
* hard nofile 1048576
* soft nproc 65536
* hard nproc unlimited
* hard memlock unlimited

CPU Profile

# Apply the performance governor to every CPU (a redirect target cannot be brace-expanded, so loop over the files)
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo performance > "$g"; done

Huge Page

echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled

Network

ifconfig <eth> mtu 9000
ifconfig <eth> txqueuelen 1000
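When the same kernel settings have to be applied to many hosts, it can help to check a sysctl.conf file against the expected values before a test run. The following is a minimal sketch of such a check, not part of the original test setup: the helper names are hypothetical, and it only parses text in `key = value` form, skipping blank lines and comment lines that start with `#` or `;`.

```python
# Expected values, copied from the /etc/sysctl.conf settings listed above.
EXPECTED = {
    "vm.swappiness": "10",
    "net.core.rmem_max": "16777216",
    "net.core.wmem_max": "16777216",
    "net.ipv4.tcp_rmem": "4096 87380 16777216",
    "net.ipv4.tcp_wmem": "4096 65536 16777216",
    "net.core.netdev_max_backlog": "250000",
}


def parse_sysctl(text):
    """Parse sysctl.conf-style text into a dict of key -> normalized value."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line
        # Collapse runs of whitespace so "4096  87380" equals "4096 87380".
        settings[key.strip()] = " ".join(value.split())
    return settings


def find_mismatches(settings, expected=EXPECTED):
    """Return {key: (expected, actual)} for every missing or differing key."""
    return {
        key: (want, settings.get(key))
        for key, want in expected.items()
        if settings.get(key) != want
    }


if __name__ == "__main__":
    from pathlib import Path

    conf = Path("/etc/sysctl.conf")
    if conf.exists():
        for key, (want, got) in find_mismatches(parse_sysctl(conf.read_text())).items():
            print(f"{key}: expected {want!r}, found {got!r}")
```

Note that this checks the configuration file only; values actually in effect on a running system live under /proc/sys and may differ until the file is reloaded.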


Performance Results – Replication: HDFS vs. vSAN (Tech Preview)

HDFS replication with host affinity delivers optimal performance

[Chart: vSAN 4DGs, HDFS replication with host affinity]

*Other names and brands may be claimed as the property of others.


Shared Nothing Architecture

Typical Applications

NoSQL:

MongoDB

Cassandra

Couchbase

Big Data:

Apache Hadoop

Apache Spark

Others


MongoDB Failure Testing


Failure Testing

Test Overview

From the perspective of MongoDB’s replica set setting, the test is divided into two parts:

• Replica set enabled: there are three virtual machines in a MongoDB replica set, abbreviated as ‘rs=3’.

• Replica set disabled: there is only one virtual machine in a MongoDB replica set, abbreviated as ‘rs=1’.

From the perspective of failure, we tested two types of failure:

• A physical host failure, which powers off all the running virtual machines residing on the host. When a host fails, VMware vSphere High Availability restarts the impacted virtual machines on another host. This is what makes ‘rs=1’ feasible while keeping service downtime low.

• A physical disk failure in a vSAN datastore, which causes a vSAN object to enter a degraded state. With the storage policy set to FTT=1, the object survives and continues to serve I/O, so from the virtual machines’ perspective there is no interruption of service.


Failure Testing

Failure Testing Result

Replica Set Configuration | Failure Type | Service Interruption Time | Recovery Method
rs=1 | Host Failure | Around 120 seconds | vSphere HA restarted the failed virtual machines.
rs=1 | Disk Failure | No interruption | vSAN rebuilt the failed components.
rs=3 | Host Failure | Around 10 seconds | 1. MongoDB’s replica set failed over from the primary node to the secondary node. 2. vSphere HA restarted the failed virtual machines.
rs=3 | Disk Failure | No interruption | vSAN rebuilt the failed components.
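The slides do not say how service interruption time was measured. One plausible approach, shown here purely as an assumption, is to probe the database at a fixed interval from a client, record whether each probe succeeded, and report the longest gap between the first failed probe and the next successful one. The function name and the sample data below are hypothetical.

```python
def longest_outage(samples):
    """Return the longest interruption, in seconds, seen in a probe log.

    `samples` is an iterable of (timestamp_seconds, ok) pairs ordered by
    time. An interruption runs from the first failed probe to the next
    successful probe. An outage still in progress at the end of the log
    is not counted, because its end time is unknown.
    """
    longest = 0.0
    outage_start = None
    for timestamp, ok in samples:
        if not ok and outage_start is None:
            outage_start = timestamp          # outage begins
        elif ok and outage_start is not None:
            longest = max(longest, timestamp - outage_start)
            outage_start = None               # service is back
    return longest


if __name__ == "__main__":
    # Hypothetical probe log: probes fail from t=2 until service returns at t=12.
    log = [(0, True), (1, True), (2, False), (3, False), (12, True), (13, True)]
    print(longest_outage(log))
```

The resolution of such a measurement is bounded by the probe interval, so a result like "around 10 seconds" is only meaningful if probes are sent at least once per second or so.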
