Storage on AWS © 2017 Amazon Web Services, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon Web Services, Inc.

Upload: lamtram

Post on 19-Jun-2018


Storage on AWS


Agenda

• Introduction
• Storage Primer
• Block Storage
• Shared File Systems
• Object Store
• On-Premises Storage Integration

Introduction: Why choose AWS for storage

Compelling Economics
• Pay as you go
• No upfront investment
• No commitment
• No risky capacity planning
• No need to provision for redundancy or overhead

Easy to Use
• Self service administration
• SDKs for simple integration

Reduce Risk
• Durable and secure
• Avoid risks of physical media handling

Speed, Agility, Scale
• Reduce time to market
• Focus on your business, not your infrastructure

Storage Primer

Block vs File vs Object

Block Storage
• Raw storage
• Data organized as an array of unrelated blocks
• Host file system places data on disk
• e.g.: Microsoft NTFS, Unix ZFS

File Storage
• Unrelated data blocks managed by a file (serving) system
• Native file system places data on disk

Object Storage
• Stores virtual containers that encapsulate the data, data attributes, metadata and object IDs
• API access to data
• Metadata driven, policy-based, etc.

Storage - Characteristics

Some of the ways we look at storage:
• Durability: measure of expected data loss
• Availability: measure of expected downtime
• Security: security measures in place
• Cost: amount per storage unit, e.g. $ / GB
• Scalability: upward flexibility
• Performance: performance metrics
• Integration: ability to interact with

AWS has a variety of storage options

Amazon EBS (Elastic Block Store)

Amazon Elastic File System (EFS)

Amazon EC2 Instance Store (Ephemeral Volumes)

Amazon S3 (Simple Storage Service)

Amazon Glacier

AWS Storage Gateway: File Gateway

Amazon Snowball & Snowball Edge

AWS Snowmobile

AWS also has a variety of database options

Amazon EC2 (Self Managed)

Amazon RDS (Relational Database Service)

Amazon DynamoDB

Amazon ElastiCache

Amazon Redshift

Block Storage

Amazon EBS

• Persistent block level storage for EC2
• Pay only for what you provision
• Native redundancy and write cache
• Consistent and low-latency performance
• Optimized for random I/O
• Native support for encryption at rest (data volumes)

Amazon EBS

• Network attached block device
– Independent data lifecycle
– Virtual disks
– Multiple volumes per EC2 instance
– Only one EC2 instance at a time per volume
– Can be detached from an instance and attached to a different one
• Raw block devices
– Unformatted block devices
– Ideal for databases, filesystems
• Available in multiple types

AWS EBS Features

Durable
• Designed for five 9's reliability
• Redundant storage across multiple devices within an AZ

Secure
• Identity and Access Policies
• Encryption

Scalable
• Unlimited capacity when you need it
• Easily scale up and down

Performance
• Low-latency SSD
• Consistent I/O performance
• Stripe multiple volumes for higher I/O performance

Backup
• Point-in-time snapshots
• Copy snapshots across AZs and Regions

Amazon EBS

• Highly available block storage for all types of data
• Internet-scale storage: grow without limits
• Benefit from AWS's massive security investments
• Built-in redundancy: designed for 99.999% availability
• Low price per GB per month, no commitment, no up-front cost
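To make the 99.999% availability design target concrete, the short sketch below converts an availability percentage into expected downtime per year. It is plain arithmetic, not an AWS API; the function name and the 365-day year are assumptions chosen for illustration.

```python
def downtime_minutes_per_year(availability_percent: float) -> float:
    """Expected downtime per 365-day year implied by an availability figure."""
    minutes_per_year = 365 * 24 * 60  # 525,600 minutes
    return minutes_per_year * (1 - availability_percent / 100)


# Five nines leaves a budget of a little over five minutes per year.
print(f"{downtime_minutes_per_year(99.999):.2f} minutes/year")
```

The same function can be applied to any other availability figure quoted in this deck.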

EBS Volume Types Comparison

Magnetic
• Performance: Lowest cost
• Use cases: Infrequent data access
• Media: Magnetic (HDD)
• Max IOPS: 100 on average, with the ability to burst to hundreds of IOPS
• Price: $.05/GB/Month, $.05/million I/O

General Purpose (SSD)
• Performance: Burstable
• Use cases: Boot volumes, small to medium DBs, dev & test
• Media: SSD
• Max IOPS: Baseline 3 IOPS/GB, burstable to 3,000 IOPS
• Price: $.10/GB/Month, I/O operations free

Provisioned IOPS (SSD)
• Performance: Predictable
• Use cases: I/O intensive, relational & NoSQL
• Media: SSD
• Max IOPS: Consistently performs at provisioned level, up to 20,000 IOPS
• Price: $.125/GB/Month, $.065/provisioned IOPS
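The prices in the comparison above can be turned into a rough monthly estimate. The sketch below is only arithmetic on the 2017 slide figures; the function names are hypothetical, and it assumes the $.065 charge applies per provisioned IOPS per month.

```python
def monthly_cost_magnetic(size_gb: float, io_millions: float) -> float:
    """$.05 per GB-month plus $.05 per million I/O requests."""
    return 0.05 * size_gb + 0.05 * io_millions


def monthly_cost_general_purpose(size_gb: float) -> float:
    """$.10 per GB-month; I/O operations are free."""
    return 0.10 * size_gb


def monthly_cost_provisioned_iops(size_gb: float, provisioned_iops: int) -> float:
    """$.125 per GB-month plus $.065 per provisioned IOPS (assumed per month)."""
    return 0.125 * size_gb + 0.065 * provisioned_iops


# A 500 GB volume under each pricing model (workload figures are illustrative).
print(monthly_cost_magnetic(500, io_millions=100))
print(monthly_cost_general_purpose(500))
print(monthly_cost_provisioned_iops(500, provisioned_iops=4000))
```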

EBS Volume Types

Solid-State Drives (SSD)

General Purpose SSD (gp2)*
• Description: General purpose SSD volume that balances price and performance for a wide variety of transactional workloads
• Use cases:
– Recommended for most workloads
– System boot volumes
– Virtual desktops
– Low-latency interactive apps
– Dev and test environments
• Volume size: 1 GiB - 16 TiB
• Max. IOPS**/volume: 10,000
• Max. throughput/volume†: 160 MiB/s
• Max. IOPS/instance: 65,000
• Max. throughput/instance: 1,250 MiB/s
• Dominant performance attribute: IOPS

Provisioned IOPS SSD (io1)
• Description: Highest-performance SSD volume designed for mission-critical applications
• Use cases:
– Critical business applications that require sustained IOPS performance, or more than 10,000 IOPS or 160 MiB/s of throughput per volume
– Large database workloads
• Volume size: 4 GiB - 16 TiB
• Max. IOPS**/volume: 20,000
• Max. throughput/volume†: 320 MiB/s
• Max. IOPS/instance: 65,000
• Max. throughput/instance: 1,250 MiB/s
• Dominant performance attribute: IOPS

Hard Disk Drives (HDD)

Throughput Optimized HDD (st1)
• Description: Low cost HDD volume designed for frequently accessed, throughput-intensive workloads
• Use cases:
– Streaming workloads requiring consistent, fast throughput at a low price
– Big data
– Data warehouses
– Log processing
– Cannot be a boot volume
• Volume size: 500 GiB - 16 TiB
• Max. IOPS**/volume: 500
• Max. throughput/volume†: 500 MiB/s
• Max. IOPS/instance: 65,000
• Max. throughput/instance: 1,250 MiB/s
• Dominant performance attribute: MiB/s

Cold HDD (sc1)
• Description: Lowest cost HDD volume designed for less frequently accessed workloads
• Use cases:
– Throughput-oriented storage for large volumes of data that is infrequently accessed
– Scenarios where the lowest storage cost is important
– Cannot be a boot volume
• Volume size: 500 GiB - 16 TiB
• Max. IOPS**/volume: 250
• Max. throughput/volume†: 250 MiB/s
• Max. IOPS/instance: 65,000
• Max. throughput/instance: 1,250 MiB/s
• Dominant performance attribute: MiB/s

* Default volume type
** gp2/io1 based on 16 KiB I/O size, st1/sc1 based on 1 MiB I/O size
† To achieve this throughput, you must have an instance that supports it, such as r3.8xlarge or x1.32xlarge.

IOPS Token Bucket Model

• Each token represents an “I/O credit” that pays for one read or one write.

• A bucket is associated with each General Purpose (SSD) volume, and can hold up to 5.4 million tokens.

• Tokens accumulate at a rate of 3 per configured GB per second, up to the capacity of the bucket.

• Tokens can be spent at up to 3000 per second per volume.

• The baseline performance of the volume is equal to the rate at which tokens are accumulated: 3 IOPS per GB.
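The token bucket rules above can be worked through numerically. The following is a minimal sketch using only the figures on this slide (5.4 million tokens, 3 tokens per GB per second, 3,000 tokens per second maximum spend); the function names are hypothetical, and other limits such as the per-volume IOPS ceiling are ignored.

```python
import math

BUCKET_CAPACITY = 5_400_000  # maximum I/O credits a volume can hold
REFILL_PER_GB = 3            # credits earned per provisioned GB per second
MAX_BURST_IOPS = 3_000       # maximum credits spent per second per volume


def baseline_iops(size_gb: float) -> float:
    """Sustained rate of the volume: the rate at which credits accumulate."""
    return REFILL_PER_GB * size_gb


def burst_seconds(size_gb: float) -> float:
    """How long a full bucket lasts when the volume is driven at the burst rate."""
    net_drain = MAX_BURST_IOPS - baseline_iops(size_gb)
    if net_drain <= 0:
        return math.inf  # credits refill at least as fast as they are spent
    return BUCKET_CAPACITY / net_drain


def refill_seconds(size_gb: float) -> float:
    """How long an idle volume needs to refill an empty bucket."""
    return BUCKET_CAPACITY / baseline_iops(size_gb)


# A 100 GB volume: 300 IOPS baseline, about 33 minutes of burst, 5 hours to refill.
print(baseline_iops(100), burst_seconds(100) / 60, refill_seconds(100) / 3600)
```

Under these assumptions, a volume of 1,000 GB or more earns credits as fast as it can spend them, so the burst rate becomes its sustained rate.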


EBS Provisioned IOPS

• EBS-Optimized Instances
– Dedicated storage throughput
• Predictable Performance
– 100-20,000 IOPS per volume
– Single-digit millisecond latency
• Performance Design
– Deliver within 10% of PIOPS, 99.9% of the time


Enhanced Throughput for PIOPS & GP2 Volumes

• Maximum attainable throughput to each volume now at 500 MB/s read or write traffic (on an instance that supports it, such as r3.8xl or x1.32xl)

• An I/O request of up to 256 KB is now counted as a single I/O operation (IOP)

• In many cases you can configure the block size used by your application

• Capable of dramatically reducing your storage costs
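The link between I/O size, IOPS, and throughput in the bullets above can be sketched as simple arithmetic. The helper names below are hypothetical, 1 MB is taken as 1,024 KB, and instance-level limits are ignored.

```python
import math

MAX_IO_KB = 256  # largest request counted as a single I/O operation


def io_operations(request_kb: int) -> int:
    """Number of I/O operations one request of the given size is counted as."""
    return max(1, math.ceil(request_kb / MAX_IO_KB))


def throughput_mb_per_s(iops: int, io_size_kb: int) -> float:
    """Throughput delivered when every operation moves io_size_kb of data."""
    return iops * io_size_kb / 1024


# The same 2,000 IOPS deliver very different throughput at 16 KB and 256 KB.
print(throughput_mb_per_s(2000, 16))
print(throughput_mb_per_s(2000, 256))
```

This is why a larger application block size can lower cost: fewer provisioned IOPS are needed to reach the same throughput.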


Amazon EBS at 20,000 IOPS

• Provisioned IOPS (SSD)
– Max volume 16 TB
– Max I/O rate 20,000 IOPS
– Max throughput 320 MB/s
• General Purpose (SSD)
– Max volume 16 TB
– Max I/O rate 10,000 IOPS
– Max throughput 160 MB/s


EBS Snapshots

[Diagram: EBS volumes attached to EC2 instances inside an EC2 Availability Zone in the AWS Cloud. "Create Snapshot" stores EBS snapshots in Amazon S3; "Clone From Snapshot" creates a new EBS volume from a snapshot.]

How Do Snapshots Work?

[Diagram: over time, Snapshot 1, Snapshot 2, and Snapshot 3 capture Blocks 1-4 of a volume, which are stored in S3 as Chunks 1-4.]

EC2 Instance Store (Ephemeral Volumes)

• Free with your EC2 instance
– SAS and SSD options
– Size/type based on instance type
• Local, direct attached resource
• Consistent sequential reads and writes
• Use only for non-persistent data

Shared file system

Elastic File System (EFS)

• Fully managed file system for EC2 instances
• Provides standard file system semantics
• Works with standard operating system APIs
• Sharable across thousands of instances
• Elastically grows to petabyte scale
• Delivers performance for a wide variety of workloads
• Highly available and durable
• NFS v4-based
• Accessible from on-prem servers (New!)

Amazon EFS is Simple

• Fully managed
– No hardware, network, file layer
– Create a scalable file system in seconds!
• Seamless integration with existing tools and apps
– NFS v4.1: widespread, open
– Standard file system access semantics
– Works with standard OS file system APIs
• Simple pricing = simple forecasting

Amazon EFS is Elastic

• File systems grow and shrink automatically as you add and remove files

• No need to provision storage capacity or performance

• You pay only for the storage space you use, with no minimum fee

Amazon EFS is Scalable

• File systems can grow to petabyte scale
• Throughput and IOPS scale automatically as file systems grow
• Consistent low latencies regardless of file system size
• Support for thousands of concurrent NFS connections

Highly Durable and Highly Available

• Designed to sustain AZ offline conditions
• Resources aggregated across multiple AZs
• Superior to traditional NAS availability models
• Appropriate for Production / Tier 0 applications

Example use cases

• Big Data Analytics

• Media Workflow Processing

• Web Serving

• Content Management

• Home Directories

EFS – Mounting

[Diagram: multiple EC2 instances mounting a single EFS file system.]

EFS DNS name: availability-zone.file-system-id.efs.aws-region.amazonaws.com

Mount on machine: sudo mount -t nfs4 mount-target-DNS:/ ~/efs-mount-point
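As an illustration of the naming pattern above, the sketch below assembles the mount target DNS name and the mount command as strings. The helper names and the example Availability Zone, file system ID, and Region are hypothetical placeholders; nothing here calls AWS or runs the command.

```python
def efs_dns_name(availability_zone: str, file_system_id: str, region: str) -> str:
    """Build a mount target DNS name following the pattern on the slide."""
    return f"{availability_zone}.{file_system_id}.efs.{region}.amazonaws.com"


def mount_command(dns_name: str, mount_point: str = "~/efs-mount-point") -> str:
    """Build the NFSv4 mount command shown on the slide for a mount target."""
    return f"sudo mount -t nfs4 {dns_name}:/ {mount_point}"


# Placeholder identifiers, for illustration only.
dns = efs_dns_name("us-east-1a", "fs-12345678", "us-east-1")
print(mount_command(dns))
```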

Object Stores

Amazon S3 (Simple Storage Service)

• Web accessible object store
• Pay for exactly what you use
• Highly durable (99.999999999% design)
• Limitlessly scalable
• Natively online
• Two flavors:
– Standard Storage: $0.023* per GB / mo
– Standard - Infrequent Access Storage (min size 128KB): $0.0125* per GB / mo + data retrieval cost

* US East (N. Virginia) pricing
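The trade-off between the two flavors can be sketched as a monthly cost comparison. This uses the US East prices quoted on the slide plus the $0.01/GB Standard-IA retrieval price that appears on the later tiering slide; the function names are hypothetical, and request fees, minimum storage durations, and data transfer charges are deliberately left out.

```python
STANDARD_PER_GB = 0.023        # S3 Standard, per GB-month
IA_PER_GB = 0.0125             # S3 Standard-IA, per GB-month
IA_RETRIEVAL_PER_GB = 0.01     # Standard-IA retrieval, per GB
IA_MIN_OBJECT_KB = 128         # smaller objects are billed as 128 KB in Standard-IA


def standard_monthly_cost(stored_gb: float) -> float:
    return stored_gb * STANDARD_PER_GB


def ia_monthly_cost(stored_gb: float, retrieved_gb: float) -> float:
    return stored_gb * IA_PER_GB + retrieved_gb * IA_RETRIEVAL_PER_GB


def ia_billable_kb(object_kb: float) -> float:
    """Size an object is billed at in Standard-IA, given the 128 KB minimum."""
    return max(object_kb, IA_MIN_OBJECT_KB)


# 1,000 GB stored, 100 GB of it read back during the month.
print(standard_monthly_cost(1000), ia_monthly_cost(1000, retrieved_gb=100))
```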

Amazon S3 (Simple Storage Service)

• Parallel I/O for max speed (Multipart Upload, Ranged GETs)
• Resource-level IAM permissions
• Bucket Policies & ACLs
• Direct access through APIs
• Server Side Encryption
• Static Website Hosting
• Data Lifecycle Rules
• Amazon Athena (New)
– Interactive Query Service that makes it easy to analyze data in Amazon S3 using standard SQL
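Parallel I/O with ranged GETs relies on splitting an object into byte ranges that can be fetched concurrently. The sketch below only computes the HTTP Range header values (which are inclusive on both ends); the function name is hypothetical and no S3 request is made.

```python
def byte_ranges(object_size: int, part_size: int) -> list[str]:
    """Split an object of object_size bytes into HTTP Range header values."""
    ranges = []
    for start in range(0, object_size, part_size):
        end = min(start + part_size, object_size) - 1  # Range ends are inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges


# A 10-byte object fetched in parts of 4 bytes; the last part is shorter.
print(byte_ranges(10, 4))
```

Each range could then be requested by a separate worker and the parts reassembled in order.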

Object Storage Tiering

S3 Standard
• Primary data
• Big Data Analytics
• Small objects
• Temporary scratch space

S3 - IA
• File sync and share
• Active Archive
• Enterprise backup
• Media transcoding
• Geo-redundancy/DR

Glacier
• Deep/offline archives
• Tape vaulting replacement
• WORM-compliant data

Data tiering using S3 Life Cycle Policies
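A lifecycle policy that tiers data from S3 Standard to S3-IA and then to Glacier might look like the sketch below. It only builds the configuration as a Python dictionary in the shape S3 lifecycle configurations use; the rule ID, prefix, and function name are hypothetical, and the 30-day and 90-day thresholds are taken from the tiering figures later in this deck.

```python
import json


def tiering_rule(prefix: str = "", ia_after_days: int = 30,
                 glacier_after_days: int = 90) -> dict:
    """Lifecycle configuration: Standard -> Standard-IA -> Glacier by object age."""
    return {
        "Rules": [
            {
                "ID": "tier-to-ia-then-glacier",   # hypothetical rule name
                "Filter": {"Prefix": prefix},      # empty prefix = whole bucket
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


print(json.dumps(tiering_rule("logs/"), indent=2))
```

With boto3, a dictionary of this shape is what `put_bucket_lifecycle_configuration` accepts as its `LifecycleConfiguration` argument; applying it requires credentials and a real bucket, which is why the sketch stops at building the document.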

Object Storage Use Cases

S3
• Cloud Applications
• Big Data Analytics
• Content Distribution
• Primary Data
• Temporary & Small Objects

S3-IA
• File Sync & Share
• Active Archive
• Enterprise Backup
• Media Transcoding
• Disaster Recovery / Geo Redundancy

Glacier
• Deep / Offline Archives
• Tape Vaulting Replacement
• WORM Compliant Data

Data moves between tiers through Lifecycle policies.

Storage Tiered To Your Requirements

Common to all tiers
• Available: S3 99.99%, S3-IA 99.9%
• Performant: low latency, high throughput
• Durable: 99.999999999%
• Scalable: elastic capacity, no preset limits

S3: "Hot" data (active and/or temporary data)
• ≥ 0 days
• > 0K
• Starts at $0.023 / GB per month

S3-IA: "Warm" data (infrequently accessed data)
• ≥ 30 days
• ≥ 128K
• $0.0125 / GB per month
• $0.01/GB retrieval

Glacier: "Cold" data (archive and compliance data)
• ≥ 90 days
• > 0K
• $0.004 / GB per month
• 3 new retrieval options:
– Expedited: 1-5 mins, $0.03 / GB
– Standard: 3-5 hrs, $0.01 / GB
– Bulk: 5-12 hrs, $0.0025 / GB
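The three Glacier retrieval options trade speed for price. The sketch below tabulates the per-GB prices and retrieval times quoted above and computes the data retrieval charge for a given size; the names are hypothetical and per-request fees are assumed away.

```python
# option -> (price per GB retrieved, typical retrieval time), from the slide
RETRIEVAL_OPTIONS = {
    "Expedited": (0.03, "1-5 mins"),
    "Standard": (0.01, "3-5 hrs"),
    "Bulk": (0.0025, "5-12 hrs"),
}


def retrieval_cost(size_gb: float, option: str) -> float:
    """Per-GB retrieval charge for restoring size_gb with the given option."""
    price_per_gb, _ = RETRIEVAL_OPTIONS[option]
    return size_gb * price_per_gb


# Restoring 1,000 GB under each option.
for name, (_, wait) in RETRIEVAL_OPTIONS.items():
    print(f"{name}: ${retrieval_cost(1000, name):.2f}, available in {wait}")
```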

S3 Storage Management Features New!

S3 Object Tagging - manage and control access for Amazon S3 objects.
• Object Tags are key-value pairs applied to S3 objects which can be created, updated or deleted at any time during the lifetime of the object.
– Provide the ability to create Identity and Access Management (IAM) policies, set up S3 Lifecycle policies, and customize storage metrics.
– Manage transitions between storage classes and expire objects in the background.

S3 Analytics, Storage Class Analysis - you can analyze storage access patterns and transition the right data to the right storage class.
• Automatically identifies the optimal lifecycle policy to transition less frequently accessed storage to SIA.
• Configure a storage class analysis policy to monitor an entire bucket, a prefix, or object tag. Once an infrequent access pattern is observed, easily create a new lifecycle age policy based on the results.
• Provides daily visualizations of your storage usage in the AWS Management Console that can be exported to an S3 bucket to analyze using the business intelligence tools of your choice, such as Amazon QuickSight.

S3 Inventory - simplify and speed up business workflows and big data jobs.
• Provides a CSV (Comma Separated Values) flat-file output of your objects and their corresponding metadata on a daily or weekly basis for an S3 bucket or a shared prefix.

S3 CloudWatch Metrics - understand and improve the performance of your applications that use S3.
• Monitoring and alarming on 13 new S3 CloudWatch Metrics.
• Receive 1-minute CloudWatch Metrics, set CloudWatch alarms, and access CloudWatch dashboards to view real-time operations and performance, such as bytes downloaded and the 4xx HTTP response count of your Amazon S3 storage.
• For web and mobile applications that depend on cloud storage, these let you quickly identify and act on operational issues.

Amazon Glacier

• Low-cost archival storage
• Secure
– SSL & AES-256
• Durable
– Designed for 99.999999999% durability
• Optimized for data archiving and backup
• Suitable for RTO measured in hours
• Includes storage costs and retrieval costs
• Three retrieval options: Expedited, Standard, Bulk
• As little as $0.004 per GB/Month (US East pricing)
• Integrated with S3

Amazon CloudFront

• Easy-to-use Content Delivery Network (CDN)
• Pay-as-you-go pricing
• Multiple origins: S3, EC2, on-premises
• Worldwide network of 68+ edge locations
• Video streaming
• Geo Restriction
• Custom SSL Certificates
• Dynamic Content
• POST/PUT

On-Premises Storage Integration

Storage Gateway hybrid storage solutions

Enables using standard storage protocols to access AWS storage services

[Diagram: files, volumes, and tapes flow through AWS Storage Gateway to Amazon EBS snapshots, Amazon S3, and Amazon Glacier, integrated with AWS Identity and Access Management (IAM), AWS Key Management Service (KMS), AWS CloudTrail, and Amazon CloudWatch.]

Storage Gateway – Files, volumes, and tapes

File gateway
• NFS (v3 and v4.1) interface
• On-premises file storage backed by Amazon S3 objects

Tape gateway
• iSCSI virtual tape library interface
• Virtual tape storage in Amazon S3 and Glacier with VTL management

Volume gateway
• iSCSI block interface
• On-premises block storage backed by S3 with EBS snapshots

Storage Gateway – Common capabilities

Standard storage protocols integrate with on-premises applications

Local caching for low-latency access to frequently used data

Efficient data transfer with buffering and bandwidth management

Native data storage in AWS

Stateless virtual appliance for resiliency

Integrated with AWS management and security

File gateway

On-premises file storage maintained as objects in Amazon S3

• Data stored and retrieved from your S3 buckets
• One-to-one mapping from files to objects
• File metadata stored in object metadata
• Bucket access managed by IAM role you own and manage
• Use S3 Lifecycle Policies, versioning, or CRR to manage data

[Diagram: application servers on customer premises connect over NFS v3 / v4.1 to the file gateway, which connects over HTTPS to S3 Standard, S3 Standard - Infrequent Access, and Glacier.]

Volume gateway

On-premises volume storage backed by Amazon S3 with EBS snapshots

• Block storage in S3 accessed via the volume gateway
• Data compressed in-transit and at-rest
• Backup on-premises volumes to EBS snapshots
• Create on-premises volumes from EBS snapshots
• Up to 1PB of total volume storage per gateway

[Diagram: on customer premises, servers connect over iSCSI to the volume gateway, which connects over HTTPS to a Storage Gateway bucket in Amazon S3 and to Amazon EBS snapshots.]

Tape gateway

Virtual tape storage in Amazon S3 and Glacier with VTL management

• Virtual tape storage in S3 and Glacier accessed via tape gateway
• Data compressed in-transit and at-rest
• Unlimited virtual tape storage, with up to 1PB of tapes active in library
• Supports leading backup applications

[Diagram: on customer premises, a backup server connects over iSCSI to the tape gateway (media changer and tape drive), which connects over HTTPS to AWS. Virtual tapes are stored in Amazon S3; archived tapes are stored in Amazon Glacier.]

Hybrid storage use cases with Storage Gateway

Enabling cloud workloads
• Move data to AWS storage for Big Data, cloud bursting, or migration

Tiered cloud storage
• Easily add AWS storage to your on-premises environment

Backup, archive, and disaster recovery
• Cost effective storage in AWS with local or cloud restore

Storage Gateway – Key Benefits

Seamless integration across standard storage protocols

Low-latency access

Durability, cost, and elasticity of AWS Storage services

Efficient data transfer

Data encryption

Integrated with AWS monitoring, management, and security

Amazon Snowball & Snowball Edge

• Petabyte scale data transport
• Uses secure appliances
• Economic and fast
• Faster than Internet for significant data sets
• Import into S3
• HIPAA Compliant (New)

What is Snowball? Petabyte scale data transport

• E-ink shipping label
• Ruggedized case ("8.5G impact")
• All data encrypted end-to-end
• 80 TB capacity, 10G network
• Rain & dust resistant
• Tamper-resistant case & electronics

How it works

How fast is Snowball?

• Less than 1 day to transfer 250TB via 5x10G connections with 5 Snowballs, less than 1 week including shipping
• Number of days to transfer 250TB via the Internet at typical utilizations:

Utilization   1Gbps   500Mbps   300Mbps   150Mbps
25%           95      190       316       632
50%           47      95        158       316
75%           32      63        105       211
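The day counts in the table can be reproduced with a short calculation. The sketch below is an assumption about how the figures were derived: it treats 1 TB as 1,024 GB and 1 Gbps as 1,000 Mbps, which matches every cell of the table after rounding. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(size_tb: float, link_mbps: float, utilization: float) -> float:
    """Days needed to move size_tb over a link used at the given utilization."""
    size_gigabits = size_tb * 1024 * 8               # TB -> GB -> gigabits
    effective_gbps = (link_mbps / 1000) * utilization
    return size_gigabits / effective_gbps / SECONDS_PER_DAY


# Reproduce the 25% utilization row for 250 TB.
for mbps in (1000, 500, 300, 150):
    print(mbps, round(transfer_days(250, mbps, 0.25)))
```

Changing `size_tb` or the utilization shows quickly when shipping appliances beats the network.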

Amazon Snowmobile

• Exabyte-scale data transfer service
• Each Snowmobile can transfer up to 100PB
• Delivered to your site like a container
• Connects to your network via removable high-speed network switch
• Appears as network-attached data store
• Once connected, secure, high-speed data transfer begins
• After data transfer, Snowmobile is driven back to AWS and data is loaded into the AWS service you select, e.g. S3, Redshift, Glacier

Using Multiple Storage Options Together

• EBS + S3: snapshots

• S3 + EC2 Instance Store: caching

• S3 + CloudFront: edge caching

• S3 + Glacier: data lifecycle archiving

It’s all about choice

Performance-oriented vs. cost-oriented

Any Questions?

Appendix

Amazon Athena(GA: US East (N. Virginia) and US West (Oregon) )

• Interactive Query Service that makes it easy to analyze data in Amazon S3 using standard SQL

• Interactive query service

• Analyze data directly in Amazon S3

• Use standard (ANSI) SQL

• No ETL required

• Fast performance. Scales automatically

• Serverless. Zero infrastructure. Zero admin

• Pay only for the queries you run


Amazon Aurora with PostgreSQL compatibility (Preview)

• Full PostgreSQL compatibility and up to twice the performance

• All the features of Amazon Aurora

• Availability: failover time of < 30 seconds

• Durability: 6 copies across 3 Availability Zones

• Read Replicas: single-digit millisecond lag times on up to 15 replicas

• Cloud-native security and encryption with AWS KMS, IAM, etc.

• Easy migration using AWS Database Migration Service and AWS Schema Conversion Tool

Introducing AWS Snowball Edge

Petabyte-scale data transport with on-board compute

• 100 TB of capacity
• Lambda functions on-board
• Snowball clusters
• S3 compatible endpoint, NFS mount point

AWS Snowball Edge Use Cases

Extension of your data center
• Write data directly as data is generated

Process data
• Encrypted, secure, and embedded compute

Expedites move
• Offers a fast and cost effective way to ensure data can be quickly transferred to and from the cloud

Simplifies data transfer
• Uses standard and familiar tools for the data transfer process

Introducing AWS Snowmobile

• 45-foot long ruggedized shipping container
• Up to 100PB of capacity
• Load data into S3 or Glacier
• Dedicated security personnel, GPS tracking, alarm monitoring, 24/7 video surveillance, and optional escort security while in transit
• Data encrypted with 256-bit encryption keys, managed through KMS

AWS Snowmobile Use Cases

• Move storage to cloud (images, media files, archives)
• Data center shut down
• Available to customers in US only
• Each engagement will have customized pricing based on:
– Volume of data the customer would like to migrate
– Data center set up
– Duration of use
• Published pricing guideline will be $0.005/GB per month