[rightscale webinar] architecting databases in the cloud: how rightscale does it
DESCRIPTION
Your database is the foundation of your application. With cloud comes new advantages and considerations for architecting and deployment. Find out how RightScale uses SQL and NoSQL databases such as MySQL, MongoDB, and Cassandra to provide a scalable, distributed, and highly available service around the globe.
TRANSCRIPT
ARCHITECTING DATABASES FOR SCALABILITY &
AVAILABILITY IN THE CLOUD:
HOW RIGHTSCALE DOES IT
• Josep Blanquer, Chief Architect, RightScale
• Raphael Simon, Senior Systems Architect, RightScale
• Ali Khajeh-Hosseini, Director of Development, RightScale
Q&A
• Ben Ingalls, Sales Development Representative, RightScale
Please use the “Questions” window to ask questions at any time
Your Panel Today
• Main Technologies Used
• Data Storage and Design for:
• Cloud Management
• Self Service
• Cloud Analytics
• Conclusions
• Q&A
Agenda
• RightScale uses a mix of RDBMS and NoSQL technologies:
• MySQL, Cassandra, MongoDB, Redshift, and S3
• Each is typically chosen for features such as:
• Transactionality
• Availability
• Sharding
• Queryability
• Raw performance
• Etc.
Intro: Tools and Technologies
• Strong ACID properties
• Availability through async replication (for “HA” and DR)
• Read scalability through multiple slaves
• Powerful SQL “queryability”
• Examples of data from our Cloud Management product:
• Users, Plans, Settings
• Published marketplace assets
• Local assets like:
• ServerTemplates, Scripts
• Deployments and server configurations
• Alert definitions
Strong Points: MySQL
• High-availability properties
• Distributed, master-less
• Easy to horizontally scale (automatic data sharding and rebalancing)
• Tunable replication (including multi-DC)
• Tunable consistency
• TTL (Time To Live) in data elements
• Examples from our Cloud Management product:
• Events
• Audits
• Across-cloud message routing
• Session data
• Tags
Strong Points: Cassandra
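"Tunable consistency" is usually reasoned about with a quorum rule: with N replicas, a read that contacts R of them is guaranteed to overlap the latest acknowledged write to W of them whenever R + W > N. The sketch below only illustrates that arithmetic; the function names are hypothetical and are not part of any Cassandra driver.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas, e.g. 2 out of 3."""
    return replication_factor // 2 + 1


def reads_see_latest_write(n: int, r: int, w: int) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    return r + w > n
```

With three replicas, quorum reads and quorum writes (2 + 2 > 3) overlap, while reading and writing a single replica (1 + 1) trades that guarantee for lower latency and higher availability.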
• Mostly offline data retrieval
• Large scale and availability
• Large amounts of data
• When no querying is necessary
• Examples from our Cloud Management product
• Archived audits (encrypted)
• Scraped git repositories
• Archived monitoring data
Strong Points: S3
• Document oriented storage
• Built-in replication support
• Built-in sharding support
• Test and set query
• Examples from our Self Service product
• Cloud Application Templates (CATs)
• Catalog Applications
• Running Applications
Strong Points: MongoDB
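A "test and set" query changes a document only if it is still in the state the caller expects, which lets concurrent workers coordinate without explicit locks. The following is a minimal in-memory sketch of those semantics, not MongoDB code; `DocumentStore` and its methods are hypothetical names used for illustration.

```python
import threading


class DocumentStore:
    """In-memory stand-in for a document database (illustrative only)."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()

    def insert(self, doc_id, doc):
        with self._lock:
            self._docs[doc_id] = dict(doc)

    def get(self, doc_id):
        with self._lock:
            return dict(self._docs[doc_id])

    def test_and_set(self, doc_id, field, expected, new_value):
        """Set doc[field] to new_value only if it currently equals expected."""
        with self._lock:
            doc = self._docs.get(doc_id)
            if doc is None or doc.get(field) != expected:
                return False  # someone else changed it first
            doc[field] = new_value
            return True
```

If two workers both try to move the same application from "queued" to "launching", only the first call returns True; the second sees that the state has already changed and backs off.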
• Simple to get started and manage
• Scales to handle up to a petabyte of data
• Powerful SQL “queryability”: we can explore the data easily
• Examples from our Cloud Analytics product
• Storing years of usage, cost and pricing data, e.g.:
• Instance-id-1 with x, y, z params, launched on T1 and terminated at T2
• Price of instance-type-X with x, y, z params at T1 was $0.01
Strong Points: Redshift
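The two kinds of records above (usage with launch and termination times, prices with the time they applied) are enough to derive a cost. A rough sketch, assuming hypothetical record shapes `Usage` and `Price` and the simplification that an instance is billed at the price in effect when it launched:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Usage:
    instance_id: str
    instance_type: str
    launched_at: datetime
    terminated_at: datetime


@dataclass(frozen=True)
class Price:
    instance_type: str
    effective_from: datetime
    hourly_usd: float


def price_at(prices, instance_type, when):
    """Most recent price for instance_type that took effect at or before `when`."""
    candidates = [p for p in prices
                  if p.instance_type == instance_type and p.effective_from <= when]
    if not candidates:
        raise LookupError(f"no price for {instance_type} at {when}")
    return max(candidates, key=lambda p: p.effective_from)


def cost_usd(usage, prices):
    """Hours run multiplied by the price in effect at launch (a simplification)."""
    hours = (usage.terminated_at - usage.launched_at).total_seconds() / 3600
    return hours * price_at(prices, usage.instance_type, usage.launched_at).hourly_usd
```

In a warehouse the same join between usage and pricing would typically be expressed in SQL; the Python version only shows the shape of the calculation.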
• Let’s take a peek at:
• How the data storage architecture is designed
• How some of these technologies are deployed
• With examples in each of our three main products:
• Cloud Management
• Self Service
• Cloud Analytics
Storage Architecture and Deployment
Streamline Operations
RightScale Cloud Management
• Unify management of compute, storage, and network
• Design portable, multi-cloud service configurations
• Orchestrate large globally distributed systems
• Control access across clouds, data centers, and tenants
Data Accessibility and Scope
[Diagram: data required by Users and by Instances, split by scope. Account: for a single account. X-Account: global, to all accounts. Labels: global, custom replication.]
Why custom? More control • Multiple sources • Individual columns • Apply transformations • Smart re-sync features
Global: MySQL • ACID semantics • Master-Slave replication
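The reasons given for custom replication (multiple sources, individual columns, transformations) can be pictured with a small sketch. This is not RightScale's replicator; `replicate_row` and `merge_sources` are hypothetical helpers that only illustrate the idea of copying selected columns, transforming them on the way, and combining rows from several sources.

```python
def replicate_row(row, columns, transforms=None):
    """Copy only the selected columns, applying optional per-column transforms."""
    transforms = transforms or {}
    return {
        col: transforms.get(col, lambda value: value)(row[col])
        for col in columns
    }


def merge_sources(*sources):
    """Combine rows from several sources keyed by id; later sources win on conflicts."""
    merged = {}
    for source in sources:
        for key, row in source.items():
            merged.setdefault(key, {}).update(row)
    return merged
```

Replicating only `id` and `email` from a user row, for example, keeps columns such as password hashes out of the global copy.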
[Diagram: account-scoped data stores shown next to the global store: dash, S3, events, tags, audit.]
Dashboard: MySQL • ACID semantics • Master-SlaveN replication • Slave reads • Rows tagged by account
Other systems: Cassandra • Simpler Key-Value access • Great scalability • Great replica control • High write availability • Time-to-live expiration as cache • Rows tagged by account
Data archive: S3 • Low read rate • Globally accessible
So we can horizontally scale our dashboard by partitioning objects based on account groups:
Clusters
[Diagram: RightScale accounts are divided into account sets (Account Set 1, Account Set 2, …), each served by a cluster (Cluster 1 … Cluster N). Every cluster has its own dash, S3, events, tags, and audit stores.]
Features: • 1 cluster: N accounts
• 1 account: 1 home
• Migratable accounts
Benefits: • Great horizontal growth
• Better failure isolation
• Independent scale
• Load rebalancing
• Versionable code
• Differentiated service
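The cluster features above (one cluster serves many accounts, each account has exactly one home, accounts can be migrated) amount to a small directory that maps accounts to clusters. A minimal sketch with hypothetical names:

```python
class ClusterDirectory:
    """Maps each account to its single home cluster (illustrative only)."""

    def __init__(self):
        self._home = {}  # account_id -> cluster name

    def assign(self, account_id, cluster):
        if account_id in self._home:
            raise ValueError(f"account {account_id} already has a home")
        self._home[account_id] = cluster

    def home_of(self, account_id):
        return self._home[account_id]

    def migrate(self, account_id, new_cluster):
        """Move an account to another cluster and return its previous home."""
        old_cluster = self._home[account_id]
        self._home[account_id] = new_cluster
        return old_cluster

    def accounts_in(self, cluster):
        return sorted(a for a, c in self._home.items() if c == cluster)
```

Load rebalancing then reduces to calling `migrate` for selected accounts; the data copy that a real migration needs is outside the scope of this sketch.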
[Diagram: instance-facing services added to the picture alongside the account data stores: routing, polling, monitor.]
And partition our cloud objects based on the cloud the instances of an account run on:
Islands
[Diagram: Island 1 … Island N, one per cloud (Cloud 1 … Cloud N). Each island runs routing, polling, and monitor services co-located with the resources in that cloud.]
Polling Clouds: MySQL • Master-Slave replication • Can port to NoSQL easily • Mostly a resource cache • But cloud partitionable
Monitoring: Custom • Replicated files • Backup to S3 • Archive to S3
Routing: Cassandra • Simpler Key-Value access • Very high availability • Great scalability • Great replica control • Plus cross DC replication*
[Diagram: Clusters 1 … N (dash, S3, events, tags, audit) and Islands 1 … N (routing, polling, monitor), spread across different geographies and different clouds.]
What if the cloud where a cluster is deployed… fails?
Sister Clusters
Full replica
Features: • Each master has an extra remote slave
• Each cluster in a pair is a DC replica of the other’s local ring
At Disaster Recovery time: • Apps are told to start serving an extra shard
• No need to provision more infrastructure to recover (best avoided, since everybody is in the same boat)
• New resources can be allocated over time to help offload existing ones
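The recovery step described above, where the surviving cluster of a pair starts serving its sister's shard, can be sketched as follows. The class and the convention that each shard is named after its home cluster are assumptions made for illustration.

```python
class SisterPair:
    """Two clusters, each holding a full replica of the other's shard."""

    def __init__(self, a, b):
        self.serving = {a: {a}, b: {b}}  # cluster -> shards it currently serves
        self.sister = {a: b, b: a}

    def fail(self, cluster):
        """Mark `cluster` as down; its sister takes over the shards it served."""
        survivor = self.sister[cluster]
        self.serving[survivor] |= self.serving.pop(cluster)
        return survivor

    def cluster_for(self, shard):
        for cluster, shards in self.serving.items():
            if shard in shards:
                return cluster
        raise LookupError(shard)
```

Because the replica already lives in the sister cluster, failing over is a routing change rather than a provisioning exercise.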
Increase innovation
• Reduce development cycles and increase agility
• Eliminate manual work with automation and orchestration
• Drive down spend with built-in cost controls
• Reduce risks with policy-based governance
RightScale Self-Service
• Self-Service deals with documents (CATs)
• AngularJS application built on top of REST API
• JSON compatibility
• High availability and good scalability with “test and set” building block query
• No built-in join but not needed
• Use case allows for heavy use of denormalization
• praxis-mapper for efficient client side joins
Why MongoDB?
• 3-node MongoDB replica set per shard
• Each replica in its own AZ
• Security groups for access control
• Write concern of 2
• Apps read from master (need consistency)
• BI, internal tools read from slaves
Self-Service HA (today)
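A write concern of 2 on a three-node replica set means a write is reported as successful once two replicas have confirmed it, so losing a single AZ does not block writes. A small sketch of that rule, with a hypothetical function name and AZ labels:

```python
def write_acknowledged(acks_by_az, write_concern=2):
    """True when at least `write_concern` replicas confirmed the write."""
    confirmed = sum(1 for ok in acks_by_az.values() if ok)
    return confirmed >= write_concern
```

With one replica per AZ, `{"az-a": True, "az-b": True, "az-c": False}` still satisfies the write concern, while two unavailable AZs do not.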
• Hidden replica in different region (application does not send requests to hidden replicas)
• Deployments in VPC
• VPN between regions
Self-Service DR (EOY)
Optimize Cloud Spend
RightScale Cloud Analytics
• Visualize all your cloud costs
• Forecast, budget, and optimize cloud costs
• Optimize your spend and reduce waste
• Implement chargeback and showback with automated reports
Cloud Analytics and Redshift
[Diagram: data sources → data fetching jobs → CSV files on S3 → data load jobs (write to all clusters) → Redshift cluster 1 … Redshift cluster N → servers that read and process data (each randomly picks one cluster and reads from it).]
• Each Redshift cluster is deployed in one availability zone. What if that AZ has issues, or the cluster goes offline?
• Our architecture makes it easy to have replicas, as there is a single “data stream” of changes, which can be written to all clusters
• We sacrificed consistency across clusters for increased availability and scalability
• If one AZ has issues:
• Writes to clusters get delayed until the AZ is back online or we take the affected cluster offline
• Reads from clusters continue to work, as servers can connect to another cluster
• We run a “create replica” rake task that stops all the writes, takes a snapshot from a working cluster, and creates a new cluster in a different AZ
Redshift HA
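The write-to-all, read-from-any pattern described above can be sketched in a few lines. This is an in-memory illustration only; `ReplicatedWarehouse` and its methods are hypothetical, and real clusters would be loaded from the CSV files on S3 rather than from Python lists.

```python
import random
from collections import deque


class ReplicatedWarehouse:
    """Fans a single stream of changes out to several independent clusters."""

    def __init__(self, cluster_names, rng=None):
        self.online = {name: True for name in cluster_names}
        self.pending = {name: deque() for name in cluster_names}  # delayed writes
        self.applied = {name: [] for name in cluster_names}
        self._rng = rng or random.Random()

    def write(self, change):
        """Send the change to every cluster; offline clusters queue it for later."""
        for name in self.online:
            self.pending[name].append(change)
            self._drain(name)

    def set_online(self, name, online):
        self.online[name] = online
        self._drain(name)

    def _drain(self, name):
        # Apply queued changes in order, but only while the cluster is reachable.
        while self.online[name] and self.pending[name]:
            self.applied[name].append(self.pending[name].popleft())

    def pick_reader(self):
        """Randomly pick one online cluster to read from."""
        candidates = [name for name, up in self.online.items() if up]
        if not candidates:
            raise RuntimeError("no cluster online")
        return self._rng.choice(candidates)
```

While a cluster is offline its copy lags behind the others, which is the consistency being traded for availability; once it returns, the queued changes are applied in order.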
• Redshift supports a “copy snapshot to different region” functionality
• A new cluster can be created from a snapshot
• Cluster configs are not stored in the snapshot and need to be configured
• EC2 instances connect to Redshift using security groups, but the instances and the cluster must be in the same region for the security groups to work
• We use Cloud Management’s monitoring system to monitor health and other metrics of clusters, and alert on them
Redshift DR
“Shown how RightScale uses several database technologies”
• For well-known relational data: MySQL (with high replication)
• For archiving and blob storage we use S3
• For very High-Availability and geo-replication we use Cassandra
• For TTL support and fast writes we also use Cassandra
• For JSON documents we use Mongo (with sharding and replica-sets)
• For large data analytics we use AWS Redshift
Conclusions
• Start a Free Trial of RightScale Today
https://www.rightscale.com/free-trial
• Get the White Paper “Designing Private & Hybrid Clouds”
http://www.rightscale.com/lp/designing-private-hybrid-clouds-white-paper
Thank You and Q&A
THANK YOU.