Storage on AWS © 2015 Amazon Web Services, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon Web Services, Inc.

Upload: vutruc

Post on 15-Feb-2018


Storage on AWS


Agenda

• Storage Primer
• Block Storage
• Shared File Systems
• Object Store
• On-Premises Storage Integration
• Structured Data Store

0. Storage Primer

Block vs File vs Object

Block Storage
• Raw storage
• Data organized as an array of unrelated blocks
• Host file system places data on disk
• e.g.: Microsoft NTFS, Unix ZFS

File Storage
• Unrelated data blocks managed by a file (serving) system
• Native file system places data on disk

Object Storage
• Stores virtual containers that encapsulate the data, data attributes, metadata and object IDs
• API access to data
• Metadata driven, policy-based, etc.

Structured storage - Databases

Relational Databases
• Static schema
• Highly structured table organization
• Rigid data format

Document Store
• Dynamic schema
• Key/value database
• Collection of complex documents
• Arbitrary, nested data format

Storage - Characteristics

Some of the ways we look at storage

• Durability: measure of expected data loss
• Availability: measure of expected downtime
• Security: security measures in place
• Cost: amount per storage unit, e.g. $ / GB
• Scalability: upward flexibility
• Performance: performance metrics
• Integration: ability to interact with

AWS has a variety of storage options

Amazon EBS (Elastic Block Store)

Amazon Elastic File System (EFS)

Amazon EC2 Instance Store (Ephemeral Volumes)

Amazon S3 (Simple Storage Service)

Amazon Glacier

AWS Storage Gateway

Amazon Import/Export Snowball

AWS also has a variety of database options

Amazon EC2 (Self Managed)

Amazon RDS (Relational Database Service)

Amazon DynamoDB

Amazon ElastiCache

Amazon Redshift

1. Block Storage

Amazon EBS

• Persistent block level storage for EC2
• Pay only for what you provision
• Native redundancy and write cache
• Consistent and low-latency performance
• Optimized for random I/O
• Native support for encryption at rest (data volumes)

Amazon EBS

• Network attached block device
– Independent data lifecycle
– Virtual disks
– Multiple volumes per EC2 instance
– Only one EC2 instance at a time per volume
– Can be detached from an instance and attached to a different one

• Raw block devices
– Unformatted block devices
– Ideal for databases, filesystems

• Available in multiple types

EBS Volume Types Comparison

• Magnetic
– Performance: lowest cost
– Use cases: infrequent data access
– Media: magnetic (HDD)
– Max IOPS: 100 on average, with the ability to burst to hundreds of IOPS
– Price: $0.05/GB/month plus $0.05 per million I/O

• General Purpose (SSD)
– Performance: burstable
– Use cases: boot volumes, small to medium DBs, dev & test
– Media: SSD
– Max IOPS: baseline 3 IOPS/GB, burstable to 3,000 IOPS
– Price: $0.10/GB/month; I/O operations free

• Provisioned IOPS (SSD)
– Performance: predictable
– Use cases: I/O intensive; relational & NoSQL
– Media: SSD
– Max IOPS: performs consistently at the provisioned level, up to 20,000 IOPS
– Price: $0.125/GB/month plus $0.065 per provisioned IOPS

IOPS Token Bucket Model

• Each token represents an “I/O credit” that pays for one read or one write.

• A bucket is associated with each General Purpose (SSD) volume, and can hold up to 5.4 million tokens.

• Tokens accumulate at a rate of 3 per configured GB per second, up to the capacity of the bucket.

• Tokens can be spent at up to 3000 per second per volume.

• The baseline performance of the volume is equal to the rate at which tokens are accumulated: 3 IOPS per GB.
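The token bucket arithmetic above can be sketched in a few lines of Python. This is a simplified model built only from the numbers on this slide; the function names are hypothetical, and it ignores any per-volume IOPS ceiling beyond the 3,000 tokens per second spend rate.

```python
import math

BUCKET_CAPACITY = 5_400_000  # maximum tokens (I/O credits) a volume can hold
REFILL_PER_GB = 3            # tokens added per provisioned GB per second
MAX_SPEND_RATE = 3_000       # tokens that can be spent per second per volume


def baseline_iops(size_gb: float) -> float:
    """Baseline rate equals the refill rate: 3 IOPS per provisioned GB."""
    return REFILL_PER_GB * size_gb


def burst_seconds(size_gb: float, tokens: float = BUCKET_CAPACITY) -> float:
    """Seconds the volume can spend at MAX_SPEND_RATE before the bucket empties.

    While bursting, the bucket still refills at the baseline rate, so the
    net drain is the spend rate minus the refill rate.
    """
    net_drain = MAX_SPEND_RATE - baseline_iops(size_gb)
    if net_drain <= 0:
        # The refill keeps up with the spend rate, so the bucket never empties.
        return math.inf
    return tokens / net_drain
```

Under these assumptions, a 100 GB volume has a 300 IOPS baseline and, starting from a full bucket, can burst at 3,000 IOPS for 2,000 seconds (roughly 33 minutes).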

EBS Provisioned IOPS

• EBS Optimized Instances
– Dedicated storage throughput

• Predictable Performance
– 100-20,000 IOPS per volume
– Single digit millisecond latency

• Performance Design
– Deliver within 10% of PIOPS, 99.9% of the time

Enhanced Throughput for PIOPS & GP2 Volumes

• Maximum attainable throughput to each volume was doubled to 128 MB/s read or write traffic

• An I/O request of up to 256 KB is now counted as a single I/O operation (IOP)

• In many cases you can configure the block size used by your application

• Capable of dramatically reducing your storage costs
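Why block size affects cost can be shown with a little arithmetic. The sketch below assumes that a request larger than 256 KB is counted as several operations (one per 256 KB), and that 1 MB = 1,024 KB; both are assumptions for illustration, and the function names are hypothetical.

```python
import math

MAX_IO_KB = 256  # largest request counted as a single I/O operation


def ops_counted(request_kb: int) -> int:
    """I/O operations charged for one request (assumes splitting above 256 KB)."""
    return max(1, math.ceil(request_kb / MAX_IO_KB))


def iops_needed(throughput_mb_s: float, block_kb: int) -> int:
    """IOPS required to sustain a throughput at a given application block size."""
    requests_per_second = throughput_mb_s * 1024 / block_kb
    return math.ceil(requests_per_second * ops_counted(block_kb))
```

In this model, sustaining 128 MB/s takes 8,192 IOPS with 16 KB blocks but only 512 IOPS with 256 KB blocks, which is where the potential saving on provisioned IOPS comes from.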

Amazon EBS at 20,000 IOPS

• Provisioned IOPS (SSD)
– Max volume 16 TB
– Max I/O rate 20,000 IOPS
– Max throughput 320 MB/s

• General Purpose (SSD)
– Max volume 16 TB
– Max I/O rate 10,000 IOPS
– Max throughput 160 MB/s

EBS Snapshots

[Diagram: EC2 instances in an EC2 Availability Zone with attached EBS volumes; EBS snapshots are stored in Amazon S3. Operations shown: Create Snapshot, Clone From Snapshot (to a new EBS volume).]

How Do Snapshots Work?

[Diagram: over time, Snapshot 1, Snapshot 2 and Snapshot 3 capture a volume's Blocks 1-4 as Chunks 1-4 in S3.]
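The snapshot timeline can be illustrated with a toy model. It relies on the fact that EBS snapshots are incremental, meaning each snapshot stores only the blocks changed since the previous one. The data structures and function names here are hypothetical and do not reflect how the service is implemented; deleted blocks are not handled.

```python
from typing import Dict, List

Blocks = Dict[int, bytes]  # block number -> block contents


def take_snapshot(volume: Blocks, previous_view: Blocks) -> Blocks:
    """Keep only the blocks that differ from what earlier snapshots hold."""
    return {n: data for n, data in volume.items() if previous_view.get(n) != data}


def restore(snapshots: List[Blocks]) -> Blocks:
    """Rebuild the full volume by replaying snapshots from oldest to newest."""
    view: Blocks = {}
    for snapshot in snapshots:
        view.update(snapshot)
    return view


# Snapshot 1 stores all four blocks; Snapshot 2 stores only the changed block.
volume_v1 = {1: b"a", 2: b"b", 3: b"c", 4: b"d"}
snap1 = take_snapshot(volume_v1, {})
volume_v2 = {**volume_v1, 2: b"B"}
snap2 = take_snapshot(volume_v2, restore([snap1]))
```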

EC2 Instance Store (Ephemeral Volumes)

• Free with your EC2 instance
– SAS and SSD options
– Size/type based on instance type

• Local, direct attached resource
• Consistent sequential reads and writes
• Use only for non-persistent data

2. Shared File System

Elastic File System (EFS)

• Fully managed file system for EC2 instances
• Provides standard file system semantics
• Works with standard operating system APIs
• Sharable across thousands of instances
• Elastically grows to petabyte scale
• Delivers performance for a wide variety of workloads
• Highly available and durable
• NFS v4–based

EFS – Mounting

[Diagram: multiple EC2 instances mounting one EFS file system]

EFS DNS name: availability-zone.file-system-id.efs.aws-region.amazonaws.com

Mount on machine: sudo mount -t nfs4 mount-target-DNS:/ ~/efs-mount-point
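As a small illustration of the naming pattern and mount command on this slide, the following sketch assembles both strings. The helper names and the sample identifiers (us-west-2a, fs-12345678) are hypothetical; it only formats text and does not call any AWS API.

```python
def efs_dns_name(availability_zone: str, file_system_id: str, region: str) -> str:
    """Build a mount target DNS name following the pattern shown on the slide."""
    return f"{availability_zone}.{file_system_id}.efs.{region}.amazonaws.com"


def mount_command(dns_name: str, mount_point: str = "~/efs-mount-point") -> str:
    """Build the NFSv4 mount command for a given mount target DNS name."""
    return f"sudo mount -t nfs4 {dns_name}:/ {mount_point}"


# Hypothetical identifiers, for illustration only.
dns = efs_dns_name("us-west-2a", "fs-12345678", "us-west-2")
command = mount_command(dns)
```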

3. Object Stores

Amazon S3 (Simple Storage Service)

• Web accessible object store
• Pay for exactly what you use
• Highly durable (99.999999999% design)
• Limitlessly scalable
• Natively online
• Two flavors:
– Standard Storage: $0.0300* per GB / mo
– Standard – Infrequent Access Storage (min size 128KB): $0.0125* per GB / mo + data retrieval cost

* US East (N. Virginia) pricing
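To compare the two storage classes, a rough monthly cost estimate can be sketched as follows. It uses the per-GB prices quoted on this slide and treats objects smaller than 128 KB in Infrequent Access as billed at 128 KB. The retrieval cost is left as a caller-supplied figure because the slide does not give a rate; function names are hypothetical and request charges are ignored.

```python
from typing import Iterable

STANDARD_PER_GB = 0.0300  # USD per GB-month, from the slide (US East)
IA_PER_GB = 0.0125        # USD per GB-month, from the slide (US East)
IA_MIN_OBJECT_KB = 128    # minimum billable object size in Infrequent Access
KB_PER_GB = 1024 * 1024


def standard_monthly_cost(total_gb: float) -> float:
    """Storage-only monthly cost in the Standard class."""
    return total_gb * STANDARD_PER_GB


def ia_billable_gb(object_sizes_kb: Iterable[float]) -> float:
    """Billable GB in Infrequent Access, rounding small objects up to 128 KB."""
    return sum(max(size, IA_MIN_OBJECT_KB) for size in object_sizes_kb) / KB_PER_GB


def ia_monthly_cost(object_sizes_kb: Iterable[float], retrieval_cost: float = 0.0) -> float:
    """Monthly cost in Infrequent Access; retrieval_cost is supplied by the caller."""
    return ia_billable_gb(object_sizes_kb) * IA_PER_GB + retrieval_cost
```

The minimum object size matters for workloads with many tiny objects: in this model, 8,192 objects of 64 KB each occupy 0.5 GB but are billed as 1 GB.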

Amazon S3 (Simple Storage Service)

• Parallel I/O for max speed (Multipart Upload, Ranged GETs)
• Resource-level IAM permissions
• Bucket Policies & ACLs
• Direct access through APIs
• Server Side Encryption
• Static Website Hosting
• Data Lifecycle Rules
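The parallel I/O idea behind ranged GETs amounts to splitting an object into byte ranges that can be fetched concurrently. A minimal sketch of the range calculation is shown below; the function name is hypothetical, and the actual download calls are omitted. The header values use the standard HTTP Range syntax with inclusive end offsets.

```python
from typing import List


def byte_ranges(object_size: int, part_size: int) -> List[str]:
    """Split an object of object_size bytes into HTTP Range header values."""
    if object_size <= 0 or part_size <= 0:
        raise ValueError("object_size and part_size must be positive")
    return [
        # The end offset is inclusive, and the last part may be shorter.
        f"bytes={start}-{min(start + part_size, object_size) - 1}"
        for start in range(0, object_size, part_size)
    ]
```

Each returned value can be sent as the Range header of a separate GET request, and the parts reassembled in order once all requests complete.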

Amazon Glacier

• Low-cost archival storage
• Secure
– SSL & AES-256
• Durable
– Designed for 99.999999999% durability
• Optimized for data archiving and backup
• Suitable for RTO measured in hours
• Includes storage costs and retrieval costs
– $0.007 per GB/month (US East pricing)
• Integrated with S3

Amazon CloudFront

• Easy-to-use Content Delivery Network (CDN)
• Pay-as-you-go pricing
• Multiple origins: S3, EC2, on-premises
• Worldwide network of 53+ edge locations
• Video streaming
• Geo Restriction
• Custom SSL Certificates
• Dynamic Content
• POST/PUT

4. On-Premises Storage Integration

AWS Storage Gateway

• VM appliance that runs on-premises
• Creates iSCSI volume mount points
• Directly interfaces with S3 or Glacier
• Gateway-Stored Volumes
• Gateway-Cached Volumes
• Virtual Tape Library

Amazon Import/Export Snowball

• Petabyte scale data transport
• Uses secure appliances
• Economical and fast
• Faster than Internet for significant data sets
• Import into S3
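The "faster than Internet" claim is easy to check with back-of-the-envelope arithmetic. The sketch below estimates how long a network transfer would take; it assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second), and the function name and utilization parameter are hypothetical.

```python
def transfer_days(data_tb: float, link_mbps: float, utilization: float = 1.0) -> float:
    """Days needed to move data_tb terabytes over a link of link_mbps megabits/s.

    utilization is the fraction of the link actually available for the transfer.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * utilization)
    return seconds / 86_400
```

For example, 100 TB over a fully utilized 1 Gbps link takes a little over 9 days, and about twice that at 50% utilization, which is the kind of data set where shipping an appliance becomes attractive.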

5. Structured Data Stores

Amazon RDS

• A fully managed SQL database service
• Simple to deploy and scale
• Without any operational burden
• Reliable and cost effective
• Choice of database engines

Amazon Aurora

If you host your databases on-premises

• You manage: power, HVAC, network; rack & stack; server maintenance; OS installation; OS patches; DB software installs; DB software patches; database backups; high availability; scaling; app optimization

If you host your databases in EC2

• AWS manages: power, HVAC, network; rack & stack; server maintenance; OS installation
• You manage: OS patches; DB software installs; DB software patches; database backups; high availability; scaling; app optimization

If you choose a managed DB service like RDS

• AWS manages: power, HVAC, network; rack & stack; server maintenance; OS installation; OS patches; DB software installs; DB software patches; database backups; high availability; scaling
• You manage: app optimization

Traditional Database Architecture

[Diagram: Client Tier, App/Web Tier, RDBMS]

One database for all workloads:
• key-value access
• complex queries
• transactions
• analytics

Cloud Data Tier Architecture

[Diagram: Client Tier, App/Web Tier, Data Tier containing Cache, Data Warehouse, RDBMS and NoSQL]

Best database for each workload

Workload Driven Data Store Selection

• Hot reads: Cache (Amazon ElastiCache)
• Key/value, simple query: NoSQL (Amazon DynamoDB)
• Analytics: Data Warehouse (Amazon Redshift)
• Complex queries & transactions: RDBMS (Amazon RDS)

Amazon DynamoDB

• Fully managed NoSQL database service
• Massively scalable, distributed key/value store
• Reserved capacity model
• Fast and predictable
• Built-in fault tolerance
• Strong consistency model
• Unlimited potential storage and throughput

Amazon ElastiCache

• In-memory cache in the cloud
• Improve latency and throughput for read-heavy workloads
• Supports open-source caching engines
– Memcached
– Redis

• Examples
– Caching of MySQL database query results
– Caching of complex query post-processing results
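Caching of database query results usually follows a cache-aside pattern: look in the cache first, and only query the database on a miss. The sketch below shows the idea with an in-process dictionary standing in for Memcached or Redis and a caller-supplied function standing in for the database. The class and its names are hypothetical, and expiry and invalidation are deliberately left out.

```python
from typing import Any, Callable, Dict


class QueryCache:
    """Cache-aside wrapper around a query function (a dict stands in for the cache)."""

    def __init__(self, run_query: Callable[[str], Any]) -> None:
        self._run_query = run_query        # stand-in for the database call
        self._store: Dict[str, Any] = {}   # stand-in for Memcached/Redis
        self.hits = 0
        self.misses = 0

    def get(self, sql: str) -> Any:
        if sql in self._store:
            self.hits += 1
            return self._store[sql]
        # Cache miss: run the query once and remember the result.
        self.misses += 1
        result = self._run_query(sql)
        self._store[sql] = result
        return result
```

A production setup would also need a time-to-live or explicit invalidation so that cached results do not outlive changes in the underlying tables.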

Amazon Redshift

• Fast and powerful, petabyte-scale data warehouse
– Fully managed
– Highly parallel
– Columnar data store

• Data warehouse-type queries
– Aggregations, historical analysis
– BI tool integration

• Grow with your data
– 160 GB to 1.6 PB

• SSD and SAS options
– SSD provides 10-15x the performance at 5.5x the cost per TB per year

Using Multiple Storage Options Together

• EBS + S3: snapshots

• S3 + EC2 Instance Store: caching

• S3 + CloudFront: edge caching

• S3 + Glacier: data lifecycle archiving

• RDS + ElastiCache: cached queries
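As one example of combining services, the S3 + Glacier pairing is typically expressed as a data lifecycle rule. The sketch below builds such a rule as a plain Python dictionary. Its shape is intended to match the `LifecycleConfiguration` argument of boto3's `put_bucket_lifecycle_configuration`, but that is an assumption not verified here; the helper name, rule ID, prefix and day counts are all illustrative.

```python
from typing import Any, Dict


def archive_rule(prefix: str, ia_after_days: int, glacier_after_days: int) -> Dict[str, Any]:
    """Build a lifecycle configuration: Standard -> Infrequent Access -> Glacier."""
    if glacier_after_days <= ia_after_days:
        raise ValueError("Glacier transition must come after the IA transition")
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix or 'all'}",  # illustrative rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


# Hypothetical policy: move logs to IA after 30 days and to Glacier after 90.
config = archive_rule("logs/", 30, 90)
```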

It’s all about choice

Performance-oriented vs. cost-oriented

Any Questions?