Scaling a Mobile Web App to 100 Million Clients and Beyond (MBL302) | AWS re:Invent 2013

© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

Joey Parsons @joeyparsons

November 14th, 2013

Scaling a Mobile Web App to 100 Million Clients and Beyond


YOUR PERSONAL MAGAZINE


The ultimate way to discover, consume & share content on the mobile, social web


How are mobile apps different?

• WiFi vs slow connectivity
• Variances in bandwidth and global carriers
• Taking advantage of the local cache
• Control your behavior during latency
• Fast devices: significant opportunity for client computation
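The local-cache and latency bullets can be illustrated with a minimal sketch. It is not taken from the talk: the `LocalCache` class, its `ttl_seconds` parameter, and the `fetch` callback are hypothetical names, and Python is used only for illustration. The idea shown is to serve fresh entries from the cache, go to the network when they expire, and fall back to the stale copy if the network call times out.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class LocalCache:
    """Hypothetical in-memory cache that can serve stale entries on slow networks."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        # key -> (time stored, value)
        self._entries: Dict[str, Tuple[float, str]] = {}

    def put(self, key: str, value: str, now: Optional[float] = None) -> None:
        stored_at = time.time() if now is None else now
        self._entries[key] = (stored_at, value)

    def get(
        self,
        key: str,
        fetch: Callable[[str], str],
        now: Optional[float] = None,
    ) -> Tuple[str, str]:
        """Return (value, source); source is 'cache', 'network', or 'stale'."""
        current = time.time() if now is None else now
        entry = self._entries.get(key)
        # Fresh entry: no network round trip at all.
        if entry is not None and current - entry[0] <= self.ttl_seconds:
            return entry[1], "cache"
        try:
            value = fetch(key)
        except (TimeoutError, ConnectionError):
            # Network too slow or unavailable: prefer stale content over an error.
            if entry is not None:
                return entry[1], "stale"
            raise
        self.put(key, value, now=current)
        return value, "network"
```

A caller would pass its own network function as `fetch`; the `now` argument exists only so the behavior can be exercised without waiting for real time to pass.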


Prototype Phase: From 0 to 1M users


- Amazon EC2
- Amazon S3
- Amazon RDS


The Initial Launch Night


Things we should have done…

• Make sure to prepare for Amazon limits if you need to scale quickly
• Make sure your external partners understand the volumes at which you’ll be accessing them


Challenges

• Understanding the scale of our services
• Little to no insight into performance
• Beginning to build out tooling for Amazon EC2, but still in its infancy
• No centralized logging or way of detecting errors


Getting Started: From 1M to 10M Users


- Amazon EC2
- Amazon RDS
- Amazon S3
- Amazon CloudFront


Architecture Changes

• Different services have different scale profiles, so we began the shift towards microservices
• Image content moved to CloudFront
• Moved primary data store to MySQL via Amazon RDS
• Homegrown bash scripts for deploys
• Focus on instrumentation
• Logging, Metrics, Monitoring followed suit


Host[i-76e33611] - Amazon Instance ID
name[tsd01] - Name of the instance
owner[ops] - Who owns the instance?
service[OOS] - IS for in service, OOS for out of service
ami[ami-3f622f56] - What AMI was used
type[m1.xlarge] - Type of EC2 instance
loc[us-east-1a] - Region and Availability Zone
role[flops] - Type of role
subclass[opentsdb] - Subclass of role
group[0] - Group of node
pool[production] - Production, staging, dev
public[50.16.58.220] - Public IP address
private[10.60.43.18] - Private IP address

SimpleDB for CMDB
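Each CMDB record is a flat set of `key[value]` attributes, so it maps naturally onto a dictionary. The sketch below is an illustration only, not the tooling from the talk: `parse_cmdb_record` is a hypothetical helper that reads one line in the `key[value]` display format used on these slides.

```python
import re
from typing import Dict

# Matches one attribute written as key[value], e.g. role[flip].
_FIELD = re.compile(r"(\w+)\[([^\]]*)\]")


def parse_cmdb_record(line: str) -> Dict[str, str]:
    """Turn 'Host[i-...]: name[...] role[...]' into a dict with lowercase keys."""
    return {key.lower(): value for key, value in _FIELD.findall(line)}


record = parse_cmdb_record(
    "Host[i-5b8ae323]: name[flip05] owner[ops] service[IS] "
    "role[flip] group[0] pool[production]"
)
print(record["host"], record["role"], record["pool"])
```

Keeping every attribute as a string mirrors how SimpleDB stores values; any numeric handling (for example of `group`) would be up to the caller.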


# fl-inst-describe -r flip -p production -g 0 -s IS -o ops

Domain[flipboard.prod.instances] has count[1] hosts meeting criteria
=======================================
Servers of role flip
=======================================

Host[i-5b8ae323]: name[flip05] owner[ops] service[IS] public[54.226.44.212] private[10.78.167.211] role[flip] group[0] pool[production] subclass[standard] type[c1.xlarge]

Querying our CMDB
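A query like the one above could plausibly be translated into a SimpleDB Select expression. The sketch below is an assumption about how such a tool might work, not the actual `fl-inst-describe` implementation: `build_select` is a hypothetical helper, and the attribute names (`role`, `pool`, `group`, `service`, `owner`) are inferred from the output shown on the slide. It relies only on documented SimpleDB quoting rules: backticks around domain and attribute names, single quotes around values, and doubling to escape either character.

```python
from typing import Dict


def _quote_name(name: str) -> str:
    # Backticks are required for names with dots; a literal backtick is doubled.
    return "`" + name.replace("`", "``") + "`"


def _quote_value(value: str) -> str:
    # Values are single-quoted; a literal single quote is doubled.
    return "'" + value.replace("'", "''") + "'"


def build_select(domain: str, criteria: Dict[str, str]) -> str:
    """Build a SimpleDB Select expression matching all given attribute values."""
    expression = "select * from " + _quote_name(domain)
    clauses = [
        f"{_quote_name(attr)} = {_quote_value(value)}"
        for attr, value in criteria.items()
    ]
    if clauses:
        expression += " where " + " and ".join(clauses)
    return expression


print(build_select(
    "flipboard.prod.instances",
    {"role": "flip", "pool": "production", "group": "0",
     "service": "IS", "owner": "ops"},
))
```

The resulting string would be passed to whichever SimpleDB client is in use; that call is left out here because the talk does not say which one was used.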


The iPhone Launch Night


Scaling Fast: 10M to 100M Users


- Storm
- Kafka
- Graphite
- Kibana


Architecture Changes

• Heavy focus on instrumentation of all services
• Pipeline of batch processing using Hadoop
• Pipeline of real-time processing using Storm + Kafka
• Keen focus on using appropriately sized EC2 instances
• Moving off of bash scripts, moving to Puppet


Mobile application instrumentation


All at once?

fl-inst-upgrade -r flip -p production -q

… or …

By group?

fl-inst-upgrade -r flip -p production -g 0 -q

Deploy by groups


Using CloudWatch metrics for errors


fl-inst-upgrade -r flip -p production -g 1 -q

Continue your deploy
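The sequence above (upgrade group 0, check error metrics, then continue with group 1) can be sketched as a small loop. This is an illustration under stated assumptions, not the internals of `fl-inst-upgrade`: `deploy_by_group`, `upgrade`, `error_rate`, and `max_error_rate` are hypothetical names, and the two callbacks stand in for whatever runs the upgrade and reads the error metric.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def deploy_by_group(
    groups: Sequence[int],
    upgrade: Callable[[int], None],
    error_rate: Callable[[], float],
    max_error_rate: float,
) -> Tuple[List[int], Optional[int]]:
    """Upgrade one group at a time and stop when errors exceed the threshold.

    Returns (groups that passed the check, group that tripped it or None).
    """
    healthy: List[int] = []
    for group in groups:
        upgrade(group)  # hypothetical hook, e.g. invoking a deploy tool for this group
        if error_rate() > max_error_rate:  # hypothetical hook, e.g. reading an error metric
            return healthy, group
        healthy.append(group)
    return healthy, None


# Example with stand-in callbacks: group 1 pushes the error rate over the limit.
rates = iter([0.001, 0.2])
print(deploy_by_group([0, 1, 2], lambda g: None, lambda: next(rates), 0.05))
```

Halting at the first unhealthy group limits the blast radius to one slice of the fleet, which is the point of deploying by group rather than all at once.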


Graphite for all metrics


Millions of metrics with Graphite
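For readers unfamiliar with Graphite, metrics are commonly submitted over its plaintext protocol: one line per data point in the form `path value timestamp`, sent to the Carbon listener (port 2003 by default). The sketch below shows that format; the metric path `flip.flip05.requests` and the helper names are made up for illustration, and the talk does not say how metrics were actually emitted.

```python
import socket
from typing import Iterable


def format_metric(path: str, value: float, timestamp: float) -> str:
    """Render one data point in Graphite's plaintext line format."""
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines: Iterable[str], host: str, port: int = 2003) -> None:
    """Send pre-formatted lines to a Carbon plaintext listener over TCP."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


# Formatting needs no network access; sending requires a reachable Carbon host.
print(format_metric("flip.flip05.requests", 42, 1384473600), end="")
```

Batching many lines into one `send_metrics` call keeps connection overhead low when the number of metrics is large.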


d3.js + cubism.js


Monitoring via CloudWatch

Alarm in PagerDuty

Details available in PagerDuty


Lessons Learned

• Use Amazon services when possible (Amazon RDS, Amazon Redshift, Amazon Route 53)
• Use SSDs where applicable
• Understand your scale and your needs going forward and invest in Reserved Instances (3 years!)
• But, allow flexibility for changing needs and instance types


Amazon Technologies Used

• Amazon CloudFront
• Amazon Route 53
• Amazon EC2
• Amazon S3
• Amazon Redshift
• Amazon RDS
• Amazon SimpleDB
• Amazon SQS
• Amazon ElastiCache
• Amazon CloudWatch


Beyond: From 100M Users to 1B!


What’s next?

• Better use of Auto Scaling groups
• Predictive analytics (lots of signals)
• Automated remediation
• Heavy focus on using the right instance types for each service
• Take advantage of new AWS products


The unknown is exciting …


Questions?


AWS re:Invent 2013 Magazine

http://flip.it/NSNEi





MBL302 Thank You

