Page 1:

Right-Sizing Your Big Data Infrastructure

Tom Lyon Founder & Chief Scientist For Strata + Hadoop World, Mar. 15, 2017

Page 2:

Cluster Out of Balance?

• Too little CPU, too much disk?

• Too little disk, too much CPU?

• How can you evolve the cluster balance as workloads change?
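One rough way to make "out of balance" concrete is to compare how much of the CPU and how much of the disk a cluster is actually using. The sketch below is illustrative only: the `Cluster` fields, the `balance` function, and the 0.25 threshold are assumptions, not something taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    cores: int            # total CPU cores provisioned
    disk_tb: float        # total disk capacity provisioned, in TB
    cores_used: int       # cores busy at peak (hypothetical measurement)
    disk_tb_used: float   # disk capacity in use, in TB


def balance(cluster: Cluster, threshold: float = 0.25) -> str:
    """Classify a cluster by the gap between CPU and disk utilization.

    The threshold is an arbitrary illustrative value.
    """
    cpu_util = cluster.cores_used / cluster.cores
    disk_util = cluster.disk_tb_used / cluster.disk_tb
    gap = cpu_util - disk_util
    if gap > threshold:
        # CPU is nearly exhausted while disks sit mostly empty
        return "too little CPU, too much disk"
    if gap < -threshold:
        # Disks are nearly full while CPU sits mostly idle
        return "too little disk, too much CPU"
    return "balanced"


# Made-up numbers, for illustration only
hot = Cluster("analytics", cores=400, disk_tb=2000.0, cores_used=380, disk_tb_used=500.0)
print(hot.name, "->", balance(hot))
```

Re-running a check like this as workloads change shows when the ratio of compute to disk needs to move.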

Page 3:

Too Many Silos? Too many SKUs?

• Each type of cluster “wants” a different amount of disk per server
  • Hadoop Data Lake
  • Dev/Test
  • HBase
  • Kafka
  • Cassandra
  • …

• Fixed silos per cluster type lead to madness
  • No resource sharing
  • No elasticity
  • Too many server types / SKUs

Page 4:

Hadoop Storage Needs vs Supposed Solutions

Storage type               | Locality | Converged compute & storage | Replication | Extreme Read BW | Erasure Coding | Examples
Hadoop HDFS                | ✔        | ✔                           | ✔           | ✔               | ✖              |
NAS - Enterprise           | ✖        | ✖                           | ✔           | ✖               | ✔              | Isilon, Qumulo, Gluster
NAS - HPC                  | ✖        | ✖                           | ✖           | ✔               | ✖              | Lustre, GPFS
SAN/Block - External       | ✖        | ✖                           | ✔           | ✖               | ☐              | ScaleIO, Ceph, Datera, Cinder, AWS EBS
SAN/Block - Hyperconverged | ✖        | ✔                           | ✔           | ✖               | ☐              | Nutanix, ScaleIO, Robin
Object                     | ✖        | ✖                           | ✔           | ✖               | ✔              | AWS S3, Scality, Swift, EMC ECS
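The comparison on this slide can be checked mechanically by encoding each row and asking which options cover a given set of needs. The following is a minimal sketch: the column keys and the `missing` helper are names chosen here, and the slide's ☐ symbol is stored as `None` because the source does not define it. Treating the first four columns as the "needs" is also an assumption.

```python
COLUMNS = ("locality", "converged", "replication", "extreme_read_bw", "erasure_coding")

# True = check mark, False = cross, None = the slide's box symbol (undefined in the source)
SOLUTIONS = {
    "Hadoop HDFS":                (True,  True,  True,  True,  False),
    "NAS - Enterprise":           (False, False, True,  False, True),
    "NAS - HPC":                  (False, False, False, True,  False),
    "SAN/Block - External":       (False, False, True,  False, None),
    "SAN/Block - Hyperconverged": (False, True,  True,  False, None),
    "Object":                     (False, False, True,  False, True),
}


def missing(name, needs):
    """Return the needs that the named storage type does not clearly provide."""
    row = dict(zip(COLUMNS, SOLUTIONS[name]))
    return [need for need in needs if not row[need]]


# Assumption: the first four columns are the needs being compared
hadoop_needs = COLUMNS[:4]
for name in SOLUTIONS:
    gaps = missing(name, hadoop_needs)
    print(f"{name}: {'covers all four' if not gaps else 'missing ' + ', '.join(gaps)}")
```

Under that reading, only the HDFS row covers all four columns, and no row other than HDFS is marked for locality.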

Page 5:

DriveScale is a rack scale architecture, providing composable infrastructure on pooled commodity resources

Typical Rack Server Rack Configuration

•  Compute pool: Processor + Memory Servers

•  1U DriveScale Adapter (DA): Ethernet to SAS

•  Storage pool: Disks in JBODs, connected via SAS to DAs

[Figure: Rack Scale Architecture, showing DriveScale Adapters]

•  DriveScale composes Logical Nodes (software defined physical nodes)

•  Example: a Logical Node might consist of a dual-processor server and 12 drives across 2 JBODs
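The composition idea can be sketched as a small allocator that binds a server from the compute pool to drives drawn from the storage pool. This is not DriveScale's software or API; `LogicalNode`, `compose`, and the round-robin placement across JBODs are hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LogicalNode:
    server: str
    drives: list  # list of (jbod_name, slot_index) tuples


def compose(servers, jbods, drives_per_node):
    """Bind each server to `drives_per_node` free drives.

    `jbods` maps a JBOD name to its number of drive slots. Drives are taken
    round-robin across JBODs so that each node spans several enclosures.
    """
    free = {name: list(range(slots)) for name, slots in jbods.items()}
    nodes = []
    for server in servers:
        picked = []
        while len(picked) < drives_per_node:
            progressed = False
            for jbod, slots in free.items():
                if slots and len(picked) < drives_per_node:
                    picked.append((jbod, slots.pop(0)))
                    progressed = True
            if not progressed:
                raise RuntimeError("storage pool exhausted")
        nodes.append(LogicalNode(server, picked))
    return nodes


# The slide's example: one dual-processor server with 12 drives across 2 JBODs.
# The JBOD names and the 60-slot size are made up.
nodes = compose(["server-1"], {"jbod-a": 60, "jbod-b": 60}, drives_per_node=12)
print(nodes[0].server, len(nodes[0].drives), "drives")
```

Because the binding lives in software, changing `drives_per_node` for a given cluster type does not require a different server SKU.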

Page 6:

DriveScale spans the data center and makes resources fungible

[Figure: DriveScale Adapters grouped into Cluster 1 (Balanced), Cluster 2 (Data Lake), and Cluster 3 (Compute Heavy)]

The boundaries between clusters are “movable” in software
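To illustrate what a "movable" boundary means, the sketch below treats each cluster as a server count plus a list of drive identifiers, and moves drives from one cluster to another without touching the servers. The data layout, the `move_drives` helper, and all counts are hypothetical; a real system would also have to handle the data on the drives being moved, which is out of scope here.

```python
def drives_per_server(cluster):
    """Ratio of drives to servers for one cluster."""
    return len(cluster["drives"]) / cluster["servers"]


def move_drives(clusters, src, dst, count):
    """Reassign `count` drives from cluster `src` to cluster `dst`."""
    if count > len(clusters[src]["drives"]):
        raise ValueError(f"{src} has fewer than {count} drives")
    moved = [clusters[src]["drives"].pop() for _ in range(count)]
    clusters[dst]["drives"].extend(moved)
    return moved


# Cluster names follow the slide; server and drive counts are made up
clusters = {
    "Cluster 1 (Balanced)":      {"servers": 10, "drives": [f"d{i}" for i in range(0, 120)]},
    "Cluster 2 (Data Lake)":     {"servers": 5,  "drives": [f"d{i}" for i in range(120, 300)]},
    "Cluster 3 (Compute Heavy)": {"servers": 20, "drives": [f"d{i}" for i in range(300, 380)]},
}

move_drives(clusters, "Cluster 2 (Data Lake)", "Cluster 3 (Compute Heavy)", 40)
for name, cluster in clusters.items():
    print(name, drives_per_server(cluster))
```

The total number of drives is unchanged by a move; only the assignment of drives to clusters differs.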

Page 7:

DriveScale’s Core Value Propositions

Flexible and Responsive Physical Infrastructure

•  Get the infrastructure that’s needed when it’s needed

•  Repurpose resources on demand

Simplicity for Any Scale

•  No changes in the app stack required

•  Equivalent performance to direct attached drives

•  No loss in “data locality” information

Enterprise Class Solution

•  Highly available, Secure, Reliable

•  Use industry standard servers and storage of your choice