Scaling Runa Inc. Big Data E-Commerce Service with AWS
DESCRIPTION
Presentation given at the first AWS Startup Event, 4/14/2010. Describes how Runa was using AWS for its SaaS for e-commerce sites.
TRANSCRIPT
Runa on AWS
Big Data & Machine Intelligence for a SaaS Startup
Runa: a SaaS
Converts Shoppers to Buyers for Online Commerce Sites
by presenting Dynamic Personalized Promotions
on the Merchant’s Website, in Real-Time, in the Shopping Flow
Tech Challenges
Big Data
JavaScript client collects activity on every Merchant page for every Shopper
One or more Ajax calls & event stores to Runa per Merchant page view
Step-function increase in calls and stores as each new Merchant is added
We capture everything we
can and store it forever
Expecting to grow to
thousands of merchants
That’s a lot of Data
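To make the step-function growth concrete, here is a rough back-of-the-envelope sketch. The function name and every traffic figure in it are hypothetical, chosen only for illustration; the slides give no actual volumes.

```python
def daily_events(merchants, page_views_per_merchant, calls_per_page_view=1):
    """Events sent to the collectors per day, under the stated assumptions."""
    return merchants * page_views_per_merchant * calls_per_page_view


if __name__ == "__main__":
    # Hypothetical traffic: 50,000 page views per merchant per day.
    # Each new merchant adds its whole traffic at once, hence the step function.
    for merchants in (10, 100, 1000):
        print(merchants, daily_events(merchants, 50_000))
```

Since nothing is ever deleted, stored volume is the running sum of these daily totals.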
Processing Data with
Machine Intelligence
Batch Processing for Statistical Analysis
and Reports
Real-Time Rule based inserts of
Promotions
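A minimal sketch of what a real-time, rule-based promotion insert could look like. This is not Runa's rule engine: the `Session` fields, the rule names, and the promotions are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Session:
    """Hypothetical snapshot of a shopper's current visit."""
    cart_total: float
    pages_viewed: int


@dataclass
class Rule:
    """A named predicate plus the promotion to show when it matches."""
    name: str
    matches: Callable[[Session], bool]
    promotion: str


# Illustrative rules only; real rules would come from the batch analysis.
RULES: List[Rule] = [
    Rule("big-cart", lambda s: s.cart_total >= 100, "free shipping"),
    Rule("hesitant-browser",
         lambda s: s.pages_viewed >= 10 and s.cart_total == 0,
         "10% off first item"),
]


def pick_promotion(session: Session, rules: List[Rule] = RULES) -> Optional[str]:
    """Return the promotion of the first matching rule, or None."""
    for rule in rules:
        if rule.matches(session):
            return rule.promotion
    return None
```

First-match-wins keeps the decision cheap enough to run inside a page request; a production engine might instead score every matching rule.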
Why AWS for Runa?
At First (a couple years ago)
Not Much Money in the Bank
Didn’t Know exactly
what we were making
Or exactly how we were going to do it
Prototyped with Ruby / Rails / MySQL
Then
Prototype became
Production
EC2 & AWS let us scale the prototype to
Beta Production
Flexibility to incrementally
refine service & infrastructure
Confidence we could scale
as we added Merchants
More Recently
Incrementally added next-gen
Tech & Full Production
Goal: Everything
Horizontally Scalable
Batch Processing & Infinite Storage
Map / Reduce & BigTable
via
Hadoop & HBase
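The Map / Reduce pattern can be sketched in a few lines of plain Python. This is a single-process illustration of the idea, not Hadoop or HBase API code, and the event fields (`merchant`, `type`) are assumed for the example.

```python
from collections import defaultdict


def map_event(event):
    """Map step: emit one (key, 1) pair per collected event."""
    yield (event["merchant"], event["type"]), 1


def reduce_counts(pairs):
    """Reduce step: sum the values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(events):
    """Run map then reduce over an iterable of event dicts."""
    pairs = (pair for event in events for pair in map_event(event))
    return reduce_counts(pairs)
```

In a real cluster the map and reduce steps run on many nodes and the framework handles the shuffle between them; the per-key logic stays this simple.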
Flexible Real-Time parallel processing
via
Clojure / Swarmiji
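The fan-out / gather idea behind real-time parallel processing can be sketched as follows. This uses Python threads in one process as a stand-in for distributed workers; it does not reflect Swarmiji's actual API, and the offer fields and scoring function are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def score_offer(offer, cart_total):
    """Hypothetical scoring: an offer is worth its discount if the cart qualifies."""
    return offer["discount"] if cart_total >= offer["min_cart"] else 0


def best_offer(offers, cart_total, workers=4):
    """Score all offers concurrently and return the highest-scoring one."""
    if not offers:
        return None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Fan out: one scoring task per offer; pool.map preserves input order.
        scores = list(pool.map(lambda offer: score_offer(offer, cart_total), offers))
    # Gather: pick the index with the highest score.
    best_index = max(range(len(offers)), key=scores.__getitem__)
    return offers[best_index] if scores[best_index] > 0 else None
```

Because each task is independent, adding workers (or machines, in a distributed setup) is what makes this kind of runtime horizontally scalable.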
Architecture diagram (labels only):
Consumers on Merchant Websites / Internet
Merchants / Internet
Admin & Merchant Dashboard (Rails)
Runtime Rules
Merchant Info
Analytics Reporting
Monitor & Recovery
Opscode Chef Management & Monitoring
Data Collectors
Load Balancer (Amazon Elastic Load Balancer)
HTTP Dispatchers
Shared Session Memory
Redis Mem Cache (multiple nodes)
Cheshire / Swarmiji Dynamic Runtime
Queue
Hadoop / HBase Map / Reduce Petabyte Store (multiple HBase nodes)
Amazon S3 Data Backup
9+ Amazon EC2 Instances
3+ Amazon EC2 Instances
AWS Elastic Load Balancer
Rails App Servers: Nginx / Unicorn, EC2 m1.xlarge
MySQL Master / Slave: EC2 m1.xlarge, EBS
Legacy Runtime
Rails App: Nginx / Unicorn, MySQL Master / Slave, EC2 m1.xlarge / EBS
Merchant Dashboard
EC2 m1.xlarge
HBase / Hadoop
EC2 m1.xlarge, EBS
RabbitMQ
Cheshire / Swarmiji, Redis
EC2 m1.xlarge
Clojure Based Runtime
AWS Elastic Load Balancer
EC2 m1.large
Opscode Chef
Monitoring
EC2 m1.large
All Deployed on
Deployment & Configuration Management
via Opscode Chef
Good Things
Able to Start Small
Then
GROW BIGGER
Having the flexibility to throw
“Hardware” at our Prototype got
us to market faster
Ability to launch test and staging
environments almost at will
“Hardware” as
“Software”
Living in “interesting”
times
Managing Complexity
lots of moving parts
Easy to launch a few instances
Impossible to manage
horizontal stacks “by hand”
Must have a tool like
Opscode Chef
Chef automates deployment & puts it under
Revision Control
There’s going to be some blood
when using cutting edge tech
Lots of Learning Curves to
climb
Useful Monitoring is
hard but Critical
HBase on AWS may be dangerous
because the Hadoop NameNode is a single point of failure (SPOF)
EC2 bill can surprise you if you cavalierly
deploy multiple versions of
horizontally scalable environments
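A quick sketch of why the bill grows so fast. The instance count of 12 comes from the minimums on the architecture slide (9+ and 3+ EC2 instances); the hourly rate and the 730-hour month are placeholder assumptions, not actual AWS prices.

```python
def monthly_cost(instances, hourly_rate, environments=1, hours=730):
    """Monthly cost of running identical copies of one environment.

    hourly_rate is a hypothetical placeholder, not a real AWS price.
    """
    return instances * hourly_rate * environments * hours


if __name__ == "__main__":
    # 12 instances at an assumed $0.50/hour, for 1 to 3 environments
    # (for example production, staging, and test).
    for environments in (1, 2, 3):
        print(environments, monthly_cost(12, 0.50, environments))
```

Cost scales with instances times environments, so every extra copy of a horizontally scaled stack multiplies the whole bill rather than adding a small increment.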
Could not do our startup without
AWS or lots more VC Funding