© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
26 October 2016
Storage with Amazon S3 and Amazon Glacier
Darryl S. Osborne – AWS Storage Specialist Solutions Architect
Agenda
• AWS Storage is a Platform
• Amazon S3 - Object storage
• Amazon Glacier - Archive storage
• Data transfer options
• Content distribution
• Case studies
• Q&A
File: Amazon EFS
Block: Amazon EBS, Amazon EC2 Instance Store
Object: Amazon S3, Amazon Glacier
Data Transfer: AWS Direct Connect, AWS Snowball, ISV Connectors, Amazon Kinesis Firehose, S3 Transfer Acceleration, Storage Gateway
Storage is a platform: AWS Storage Maturity
Object storage is foundational
[Diagram: object storage (Amazon S3, Amazon Glacier) sits beneath services in the Compute, Analytics, Database, and Content Delivery categories: EC2, Lambda, EMR, Data Pipeline, Kinesis, RDS, DynamoDB, Redshift, CloudFront, Elastic Transcoder]
Value in Every GB
Scale
Durability
Cloud Data Migration
Lifecycle Management
Broad Integration with other AWS services
Amazon S3 – Object storage
What is Amazon S3
Highly durable object storage for all types of data
Internet-scale storage: grow without limits
Low price per GB per month: no commitment, no up-front cost
Built-in redundancy: designed for 99.999999999% durability
Benefit from AWS’s massive security investments
Key Features of Amazon S3
Data Management
• Cost monitoring and controls
• Lifecycle management
Ease of use
• Programmatic access using AWS SDKs
• REST APIs
• Management Console, AWS CLI
Event Notifications
• Delivered using SQS, SNS, or Lambda
• Enable you to trigger workflows, alerts, or other processing
Data protection
• Versioning
• Cross-region replication
Security
• Multi-factor authentication delete
• Flexible access control mechanisms
• Time-limited access to objects
• Access logs
• Multiple client-side and server-side encryption options
Cross-region replication
Amazon CloudWatch & AWS CloudTrail support
VPC endpoint for Amazon S3
Read-after-write consistency in all regions
Event notifications
Amazon S3 bucket limit increase
Innovation for Amazon S3
Amazon S3 Standard-IA
Transfer Acceleration
Expired object delete marker
Incomplete multipart upload expiration
Lifecycle policy
Innovation for Amazon S3, continued…
Active data: S3 Standard
Infrequently accessed data: S3 Standard - Infrequent Access
Archive data: Glacier
Choice of storage class on Amazon S3
Lifecycle: S3 → S3-IA → Glacier
• S3: “Hot” data (active and/or temporary data)
• S3-IA: “Warm” data (infrequently accessed data)
• Glacier: “Cold” data (archive and compliance data)
Available: S3 99.99%, S3-IA 99.9%
Performant: low latency, high throughput
Secure: SSE, client encryption, IAM integration
Event Notifications: SQS, SNS, and Lambda
Versioning: keep multiple copies automatically
Cross-Region Replication
Common Namespace: define storage class per object
Durable: 99.999999999%
Scalable: elastic capacity, no preset limits
Storage tiered to your requirements
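Lifecycle: S3 → S3-IA → Glacier
S3: “Hot” data (active and/or temporary data)
• Minimum storage duration: ≥ 0 days
• Minimum object size: > 0 KB
• $0.03/GB per month
S3-IA: “Warm” data (infrequently accessed data)
• Minimum storage duration: ≥ 30 days
• Minimum object size: ≥ 128 KB
• $0.0125/GB per month
• $0.01/GB retrieval
Glacier: “Cold” data (archive and compliance data)
• Minimum storage duration: ≥ 90 days
• Minimum object size: > 0 KB
• $0.007/GB per month
• Retrieval time: 3-5 hours
• $0.01/GB retrieval above the 5% free allowance
Available: S3 99.99%, S3-IA 99.9%
Performant: low latency, high throughput
Durable: 99.999999999%
Scalable: elastic capacity, no preset limits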
Storage tiered to your requirements
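The tiering on this slide maps naturally onto an S3 lifecycle configuration. As a minimal sketch, the snippet below builds one as a plain Python dict in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. The helper name, the rule ID, the `logs/` prefix, and the bucket name in the comment are hypothetical; the 30-day and 90-day thresholds simply echo the slide.

```python
def build_tiering_rule(prefix, ia_after_days=30, glacier_after_days=90):
    """Return a lifecycle configuration that moves objects S3 -> S3-IA -> Glacier."""
    if glacier_after_days <= ia_after_days:
        raise ValueError("Glacier transition must come after the S3-IA transition")
    return {
        "Rules": [
            {
                "ID": "tier-down-" + prefix.strip("/"),  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = build_tiering_rule("logs/")

# With AWS credentials configured, the dict could be applied like this (not run here):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```

Keeping the configuration as data makes it easy to review or unit-test a tiering policy before it touches a real bucket.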
S3
S3-IA
Glacier
Use case
User files become dormant days after upload. The access pattern is usually 90% or more writes and 10% or fewer reads.
Benefits
Lower costs with minimal integration.
Assuming a 90/10 write/read ratio on S3-IA:
$0.0125/GB (storage) + $0.001/GB (retrievals: 10% × $0.01/GB) = $0.0135/GB per month
User Generated Content Example
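The arithmetic on this slide can be reproduced in a few lines. This is a sketch under one stated assumption: the "10% reads" figure is read as 10% of the stored data being retrieved once per month. The prices are the S3-IA figures quoted in the deck; the function name is illustrative.

```python
IA_STORAGE_PER_GB = 0.0125   # $/GB per month, S3-IA price from the slide
IA_RETRIEVAL_PER_GB = 0.01   # $/GB retrieved, S3-IA price from the slide


def effective_ia_cost_per_gb(read_fraction):
    """Monthly cost per stored GB when `read_fraction` of the data is read back."""
    return IA_STORAGE_PER_GB + read_fraction * IA_RETRIEVAL_PER_GB


cost = effective_ia_cost_per_gb(0.10)
print(f"${cost:.4f}/GB per month")
```

At a 10% read fraction this comes to the slide's $0.0135/GB, still less than half of the $0.03/GB quoted for S3 Standard.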
Active Archive Example
[Diagram: S3 (active data), S3-IA (on-demand reads), and Glacier (deep archive), linked by lifecycle policies]
Use case
Data reads from the archive are infrequent but require an immediate response. The data is archived for future reference or compliance and often resides on tape.
The optimal tier for deep archives is Glacier. S3-IA can be an intermediate phase into Glacier.
Customer value
Improve access to valuable content, reduce costs, and improve durability.
Example applications
• Digital media archives
• Intermediate log archives for Big Data analytics
[Diagram: S3-IA (active backup) and Glacier (long-term backup), linked by lifecycle policies]
Use case
Back up and archive on-premises data or EC2 data volumes to AWS, either directly from backup applications or through a gateway such as Storage Gateway (SGW).
Customer value
Reduce costs, simplify management, and scale without the capacity limits of on-premises tape or disk.
Enterprise Backup Example
• Preserve, retrieve, and restore every version of every object stored in your bucket
• S3 automatically adds new versions and preserves deleted objects with delete markers
• Easily control the number of versions kept by using lifecycle expiration policies
• Easy to turn on in the AWS Management Console
Key = photo.gif, ID = 121212
Key = photo.gif, ID = 111111
Versioning Enabled
PUT Key = photo.gif
Amazon S3 Versioning
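To make the versioning behaviour concrete, here is a toy in-memory model in Python. It is only an illustration of the semantics described in the bullets (every PUT adds a version, a delete adds a marker instead of erasing data); it is not the S3 API, the class and method names are invented for this sketch, and real S3 version IDs are opaque strings rather than counters.

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of S3 versioning semantics; not the real service API."""

    def __init__(self):
        self._versions = {}             # key -> list of (version_id, body or None)
        self._ids = itertools.count(1)  # simplified stand-in for opaque version IDs

    def put(self, key, body):
        version_id = str(next(self._ids))
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def delete(self, key):
        # A delete adds a delete marker (body None) instead of erasing data.
        return self.put(key, None)

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            # The latest version wins; a delete marker hides the object.
            if not versions or versions[-1][1] is None:
                raise KeyError(key)
            return versions[-1][1]
        for vid, body in versions:
            if vid == version_id and body is not None:
                return body
        raise KeyError((key, version_id))


bucket = VersionedBucket()
v1 = bucket.put("photo.gif", b"first upload")
v2 = bucket.put("photo.gif", b"second upload")
bucket.delete("photo.gif")
```

After the delete, a plain `get("photo.gif")` fails, yet both earlier versions remain retrievable by ID, which is the property that lets you restore every version of every object.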
Delivers notifications to Amazon SNS, Amazon SQS, or AWS Lambda when events occur in S3
[Diagram: S3 events are delivered as notifications to an SNS topic, an SQS queue, or a Lambda function]
Amazon S3 Event Notifications
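On the Lambda side, a notification arrives as a JSON document with a `Records` list. The handler below is a minimal sketch of pulling the bucket and key out of each record; the bucket name and key in the sample event are made up, and a real event carries many more fields than are shown here.

```python
from urllib.parse import unquote_plus


def handler(event, context=None):
    """Collect (event name, bucket, key) for each record in an S3 notification."""
    seen = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        key = unquote_plus(s3["object"]["key"])  # object keys arrive URL-encoded
        seen.append((record["eventName"], bucket, key))
    return seen


# Trimmed-down sample event with hypothetical names.
sample_event = {
    "Records": [
        {
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "uploads/my+photo.gif"},
            },
        }
    ]
}

result = handler(sample_event)
```

Decoding the key matters in practice: spaces and other special characters in object names are encoded in the notification, so using the raw value in a follow-up GET can miss the object.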
Automated, fast, and reliable asynchronous replication of data across AWS regions
Source (Virginia)
Destination (Oregon)
• Only replicates new PUTs. Once replication is configured, all new uploads into the source bucket are replicated
• Entire bucket or prefix based
• 1:1 replication between any 2 regions
• Versioning required
Use cases:
• Compliance: store data hundreds of miles apart
• Lower latency: distribute data to regional customers
• Security: create remote replicas managed by separate AWS accounts
Amazon S3 Cross-region Replication
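A replication setup is also just configuration. The sketch below builds a replication configuration dict using the original rule schema with a top-level `Prefix` (newer schema versions use a `Filter` block instead). The role ARN, bucket names, and helper name are placeholders, and versioning must already be enabled on both buckets, as the slide notes.

```python
def build_replication_config(role_arn, destination_bucket, prefix=""):
    """Return a replication configuration; an empty prefix covers the whole bucket."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes to copy objects
        "Rules": [
            {
                "ID": "replicate-" + (prefix.strip("/") or "all"),  # hypothetical name
                "Prefix": prefix,
                "Status": "Enabled",
                "Destination": {"Bucket": "arn:aws:s3:::" + destination_bucket},
            }
        ],
    }


replication = build_replication_config(
    role_arn="arn:aws:iam::123456789012:role/example-replication-role",
    destination_bucket="example-destination-oregon",
)

# With credentials configured, this could be applied with boto3 (not run here):
#   import boto3
#   boto3.client("s3").put_bucket_replication(
#       Bucket="example-source-virginia", ReplicationConfiguration=replication)
```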
Amazon S3 Virtual Private Endpoint (VPCE)
Prior to S3 VPCE
• Public IP on EC2 instances and an internet gateway (IGW)
• Private IP on EC2 instances and NAT
Using S3 VPCE
• Access S3 through the S3 VPC endpoint (VPCE) without NAT instances or gateways
• Increased security
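One common way to realise the "increased security" point is a bucket policy that only admits requests arriving through a specific VPC endpoint. The following is a hedged sketch: the bucket name and endpoint ID are placeholders, the helper name is invented, and a blanket deny like this also blocks console access from outside the VPC, so it should be tested carefully before use.

```python
import json


def vpce_only_bucket_policy(bucket, vpce_id):
    """Deny all S3 actions on `bucket` unless the request arrives via `vpce_id`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnlessFromVpce",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"StringNotEquals": {"aws:sourceVpce": vpce_id}},
            }
        ],
    }


policy = vpce_only_bucket_policy("example-bucket", "vpce-1a2b3c4d")
policy_json = json.dumps(policy, indent=2)

# With credentials configured, this could be applied with boto3 (not run here):
#   import boto3
#   boto3.client("s3").put_bucket_policy(Bucket="example-bucket", Policy=policy_json)
```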
Client-side encryption using AWS SDKs
• You manage the encryption keys and never send them to AWS
Server-side encryption (SSE) with Amazon S3 managed keys
• “Check-the-box” to encrypt your data at rest. Keys are managed by S3
SSE with customer-provided keys
• You manage your encryption keys and provide them for PUTs and GETs
SSE with AWS Key Management Service managed keys
• Keys managed centrally in AWS KMS with permissions and auditing of usage
For more details, watch Encryption and Key Management in AWS: https://www.youtube.com/watch?v=uhXalpNzPU4
Amazon S3 Data Encryption Options
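For the three server-side options, the difference at upload time comes down to which parameters accompany the PUT. The helper below is an illustrative sketch that returns the extra keyword arguments one would pass to boto3's `put_object` for each mode; the helper name and the mode labels are this sketch's own convention, and client-side encryption is left out because it happens before the request is made.

```python
def sse_put_kwargs(mode, kms_key_id=None, customer_key=None):
    """Return extra put_object keyword arguments for a server-side encryption mode."""
    if mode == "SSE-S3":
        # Keys managed by S3.
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        # Keys managed in AWS KMS; omit the key ID to use the default key.
        kwargs = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id:
            kwargs["SSEKMSKeyId"] = kms_key_id
        return kwargs
    if mode == "SSE-C":
        # Customer-provided key; the same key must be supplied again on GET.
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 256-bit (32-byte) key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown mode: {mode}")


s3_managed = sse_put_kwargs("SSE-S3")
kms_managed = sse_put_kwargs("SSE-KMS", kms_key_id="alias/example-key")

# Usage with boto3 (not run here; bucket and key names are placeholders):
#   boto3.client("s3").put_object(
#       Bucket="example-bucket", Key="report.csv", Body=b"...", **s3_managed)
```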
Amazon S3 Availability & Usage
Amazon S3 holds trillions of objects and regularly peaks at millions of requests per second.
Available in 14 regions today, with 4 new regions coming soon.
Traditional storage: 1 PB raw storage, 800 TB usable storage, 600 TB allocated storage, 400 TB application data
Amazon S3: pay only for what you use!
Amazon S3 Capacity Pricing
Pay only for what you use.
There is no minimum fee.
We charge less where our costs are less, and prices are based on the location of your Amazon S3 bucket.
Estimate your monthly bill using the AWS Simple Monthly Calculator.
Amazon S3 Price
Amazon Glacier – Archive storage
Archival storage for infrequently accessed data
Amazon Glacier is optimized for infrequent retrieval
Stop managing physical media
Even lower cost than Amazon S3, with the same high durability
3-5 hour retrieval latency
5% free tier on retrievals
$0.007 per GB/month
$86 per TB/year
Replace tape libraries, VTLs
What is Amazon Glacier
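The two Glacier prices on this slide are the same figure in different units. A quick check, assuming 1 TB is counted as 1,024 GB (with 1,000 GB per TB the result would round to $84 instead):

```python
PRICE_PER_GB_MONTH = 0.007  # $/GB per month, from the slide
GB_PER_TB = 1024            # assumption: binary terabyte
MONTHS_PER_YEAR = 12

per_tb_year = PRICE_PER_GB_MONTH * GB_PER_TB * MONTHS_PER_YEAR
print(f"${per_tb_year:.2f} per TB/year")
```

The result rounds to the $86 per TB/year quoted on the slide.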
Key Features of Amazon Glacier
Vault Inventory
• Inventory all archives
• Available as JSON or CSV
Ease of use
• Programmatic access using AWS SDKs
• REST APIs
• Management Console, AWS CLI
Data Retrieval Policies
• Define data retrieval limits and cost ceiling
• Examples: “Free Tier Only”, “Max Retrieval Rate”
Access Controls
• Integrated with AWS IAM
• Supports MFA device access
Integrated Lifecycle Management
• Integrated with Amazon S3 lifecycle policies
• Establish auto-archive rules for Amazon S3 objects
Tagging Support
• Tag vaults for cost management
• Filter cost reports based on tags
Innovation for Amazon Glacier
Audit Logs
Vault Lock
Vault Access Policies
Three Ways to Ingest Data with Amazon Glacier
• Direct Glacier API/SDK: direct access to Glacier for deep archives
• S3 lifecycle integration: move older data to a less expensive archive tier
• Third-party tools and gateways: integrate existing backup and archive applications using an IT-friendly interface
Data Transfer Options
AWS Data Ingest Options
AWS Direct Connect
AWS Snowball
ISV Connectors
Amazon Kinesis Firehose
Storage Gateway
S3 Transfer Acceleration
Content Distribution
AWS provides full-site, or media asset, delivery via a worldwide content delivery network (CDN) called Amazon CloudFront.
Amazon CloudFront Edge Locations
[Diagram: an Amazon S3 bucket serving as the origin for multiple Amazon CloudFront edge locations]
- Amazon S3 can be used as durable origin for global content distribution
- Provides single origin for multiple CDNs, such as Amazon CloudFront
- Data transfer out of Amazon S3 into Amazon CloudFront is free!
- Optimal for serving static web assets such as images, videos and HTML
Single origin storage for content distribution
Case Studies
SoundCloud: Audio Transcoding
- World’s leading social sound platform
- Audio files must be transcoded and stored in multiple formats
- Stores petabytes of data
- Transcoded files served from Amazon S3 via Amazon CloudFront
- Originals moved to Amazon Glacier for cost savings
Druva inSync SaaS: Endpoint Data Protection
Druva inSync Cloud relies on:
- Amazon EC2
- Amazon S3
- Amazon DynamoDB
Partner categories: Gateway/NAS, Data Management, Sync & Share, Backup/DR, Content and Acceleration, Archive, File System
Amazon Storage Partner Ecosystem
What’s next?
Getting started with S3 and Glacier:
http://aws.amazon.com/s3/getting-started/
http://aws.amazon.com/glacier/getting-started/
Pricing:
http://aws.amazon.com/s3/pricing/
http://aws.amazon.com/glacier/pricing/
http://calculator.s3.amazonaws.com/index.html
AWS YouTube channel:
https://www.youtube.com/user/AmazonWebServices/playlists