© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Ben Bromhead, Instaclustr
October 2015
NET302
Delivering a DBaaS Using
Advanced AWS Networking
Who am I?
• Ben Bromhead, CTO @ Instaclustr
What does Instaclustr do?
• Cassandra as a Service
• Managing 300+ instances
• 95% on Amazon Web Services
What to Expect from the Session
• Exploration of challenges faced delivering DBaaS
• How and when to use AWS networking features to solve
these challenges
• A (meandering) history of our AWS journey
Some basics
What is Cassandra?
• A scalable, highly available
OLTP database
• Inspired by the Dynamo
(Amazon) and the BigTable
(Google) papers
• Tunable consistency
• Clients aware of topology
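Tunable consistency means each read and write chooses how many replicas must acknowledge it. A minimal sketch of the usual arithmetic, using hypothetical helper names that are not from the talk: a read is guaranteed to see the latest acknowledged write when the read and write replica counts together exceed the replication factor.

```python
def quorum(rf: int) -> int:
    """Replicas needed for a QUORUM operation at replication factor rf."""
    return rf // 2 + 1


def is_strongly_consistent(read_replicas: int, write_replicas: int, rf: int) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    return read_replicas + write_replicas > rf


# With a replication factor of 3, QUORUM reads plus QUORUM writes overlap.
rf = 3
print(quorum(rf), is_strongly_consistent(quorum(rf), quorum(rf), rf))
```

Reading and writing at a single replica each (1 + 1 with a replication factor of 3) trades that guarantee for lower latency.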
What a Cassandra DBaaS
should look like:
• High throughput / low
latency
• Secure
• Easy
Our first attempt at multi-tenancy
How we first started:
• Multi-tenancy was done by deploying resources under our
customers’ own AWS accounts
• Limited access IAM user
• Billing done via Amazon DevPay
Multi-tenancy and Cassandra
• Cassandra is a scale-out OLTP / operational database,
designed for use cases that grow beyond a single server
• No point trying to multi-tenant within Cassandra
• Other than app level, 99% of multi-tenant use cases don’t
make sense for a highly scalable DB like Cassandra
• Need to multi-tenant at the cluster level
Multi-tenancy by AWS account
[Diagram: region US_EAST_1 with Availability Zones A, B and C; one Cassandra node per zone, repeated for each customer account from Customer 1 to Customer N]
Multi-tenancy by AWS account
Pros:
• Deployed in customer
account – access was
simple
• Billing was simple
Cons:
• Change over to VPC?
• No two AWS accounts are
the same
• Billing wasn’t flexible
• Customers would mess with
our stuff
• Unable to detect AZ
capacity
Multi-tenancy by VPC
Pros:
• Reduce support overhead
• Flexible billing
• Simplify AWS interface
Cons:
• Had to rewrite everything
• Had to do our own billing
• Already know our AZ
capacity
• Used this opportunity to
move across to using
VPCs… but how to connect?
Multi-tenancy by VPC
[Diagram: region US_EAST_1 with three Availability Zones; Cassandra nodes spread across the zones, grouped per customer: Customer 1, Customer 2, … Customer N]
Multi-tenancy by VPC
Side effects include:
We now have lots and lots of VPCs
Multiple accounts to get around VPC hard limits…
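The talk does not say how customers were spread across accounts. One simple way to picture it, as a sketch only: fill each account up to a VPC cap, then move to the next. The function name, the `account-N` labels, and the `vpcs_per_account` value are all hypothetical; the real per-account limit should be checked against your own AWS quotas.

```python
from collections import defaultdict
from typing import Dict, List


def assign_accounts(customers: List[str], vpcs_per_account: int) -> Dict[str, List[str]]:
    """Place one VPC per customer, filling each account up to the given cap."""
    if vpcs_per_account < 1:
        raise ValueError("vpcs_per_account must be at least 1")
    placement: Dict[str, List[str]] = defaultdict(list)
    for index, customer in enumerate(customers):
        # Hypothetical account label; a real system would map to account IDs.
        account = f"account-{index // vpcs_per_account + 1}"
        placement[account].append(customer)
    return dict(placement)


print(assign_accounts(["customer-1", "customer-2", "customer-3"], vpcs_per_account=2))
```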
When to multi-tenant with VPC
1. The service you provide is a network service
2. The service you provide is directly related to resource
consumption (CPU, RAM, etc.)
3. The service you deploy leverages a complex network
configuration (multi-region, multi-AZ)
Support connectivity from outside AWS
• Hybrid clusters that span cloud / private data centers
• Support multi-region Cassandra clusters
• Support developers connecting from their personal
machines
• Occasional service running in a different provider
Resulting requirement:
• Support connectivity from outside an AWS region
Luckily Cassandra is awesome…
• Cassandra natively understands NAT’d environments
• Deploy instances in a subnet with an IGW
• Public IP for every node
• Sprinkle in some security group magic and Cassandra
authentication
Problem solved!
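In practice, "understands NAT" comes down to a node binding to its private address while advertising its public one. The sketch below builds the relevant `cassandra.yaml` address settings; the function name and example IPs are illustrative, and the talk does not show the configuration Instaclustr actually used.

```python
import ipaddress
from typing import Dict


def nat_address_settings(private_ip: str, public_ip: str) -> Dict[str, str]:
    """Address settings for a node that sits behind 1:1 NAT (e.g. an EC2 public IP)."""
    # Parsing validates the input; ip_address raises ValueError on bad addresses.
    private = ipaddress.ip_address(private_ip)
    public = ipaddress.ip_address(public_ip)
    return {
        "listen_address": str(private),        # interface the node binds to
        "broadcast_address": str(public),      # address advertised to other nodes
        "rpc_address": str(private),           # interface clients connect to
        "broadcast_rpc_address": str(public),  # address advertised to clients
    }


for key, value in nat_address_settings("10.0.1.15", "203.0.113.10").items():
    print(f"{key}: {value}")
```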
Cassandra with public IPs
[Diagram: three Cassandra nodes, one per VPC subnet, inside a security group and reached through an Internet Gateway]
Support Heroku customers
Heroku is a Platform as a Service that runs on top of AWS
– you cannot dictate the IP address it connects from
Resulting requirement:
• Support secure global ingress (aka, Allow All)
Luckily Cassandra is awesome…
Add 0.0.0.0/0 to the security group…
Cassandra supports client-to-node certificate
authentication
Problem solved!
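The two halves of this approach can be sketched as data: an ingress rule open to every address, and Cassandra settings that require a client certificate. The rule follows the shape of EC2's AuthorizeSecurityGroupIngress `IpPermissions` entries, and the options mirror the `client_encryption_options` section of `cassandra.yaml`. The function names and file paths are hypothetical, and keystore passwords are omitted here for brevity.

```python
from typing import Any, Dict

NATIVE_TRANSPORT_PORT = 9042  # Cassandra's default CQL native transport port


def open_ingress_rule(port: int = NATIVE_TRANSPORT_PORT) -> Dict[str, Any]:
    """Security group rule allowing TCP on one port from any IPv4 address."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
    }


def client_auth_options(keystore: str, truststore: str) -> Dict[str, Any]:
    """client_encryption_options that reject clients without a trusted certificate."""
    return {
        "enabled": True,
        "require_client_auth": True,
        "keystore": keystore,      # node's own certificate and key
        "truststore": truststore,  # certificates the node accepts from clients
    }


print(open_ingress_rule())
print(client_auth_options("/path/to/keystore.jks", "/path/to/truststore.jks"))
```

With the network open, the certificate check is the only gate, which is why the rule and the authentication settings belong together.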
When to support universal ingress
1. Your customers are unlikely to have a static IP
2. Complex / changing access patterns
3. Your service can support robust authentication
Support private connectivity within AWS
• Some customers think that accessing their database
over a public IP address is scary
• Not all applications have direct Internet access (for
example, the app tier)
• Easy to do with EC2-Classic
Resulting requirement:
• Support access to Cassandra via private IP
Luckily AWS is awesome…
By the time we had started to look at VPCs as our
preferred environment, AWS had introduced the last
feature we needed:
• VPC peering
VPC peering – total control on both sides
[Diagram: region US EAST 1; the Instaclustr AWS account peered with Customer AWS account #1 and Customer AWS account #2, with security groups on the peered sides]
VPC peering is our most used AWS feature
70% of our production clusters have one or more VPC
peering connections with other accounts.
• Critical to adoption within the enterprise
• Critical for multi-level architectures where app layer does
not have external egress
• Almost always need to educate the customer
• Still incur inter-AZ traffic charges
• Your us-east-1a is not the same as my us-east-1a
When to use VPC peering
1. Resources accessing your service are located in AWS.
2. You provide a service used by the app / DB tier.
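One practical check before offering a peering connection: VPC peering does not work between VPCs whose IPv4 CIDR blocks overlap, which is part of why customers usually need some education first. A minimal sketch of that check using Python's standard library; the function name and the example ranges are illustrative.

```python
import ipaddress


def can_peer(requester_cidr: str, accepter_cidr: str) -> bool:
    """True when the two VPC CIDR blocks do not overlap."""
    requester = ipaddress.ip_network(requester_cidr)
    accepter = ipaddress.ip_network(accepter_cidr)
    return not requester.overlaps(accepter)


print(can_peer("10.0.0.0/16", "10.1.0.0/16"))
print(can_peer("10.0.0.0/16", "10.0.128.0/17"))
```

Allocating a distinct range to each cluster VPC up front avoids finding the clash at connection time.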
Supporting complex / custom requirements
One crucial component of success with any XaaS business
is to ensure uniformity of customer accounts:
• Reduces support cost per account
• Ensures consistent experience across customers
• One-off solutions still haunt us
• But…one-off solutions have also won us accounts and
have been rolled into production features (eventually)
Leverage AWS components
We try to always leverage AWS components for one-off
solutions within customer VPCs:
• Primarily enabled by our VPC multi-tenanting approach
– does not impact other customers
• It’s always a proven and managed solution
• Easy to bring into the fold when we support it properly
Custom solutions: an example
A customer wants access to the underlying Cassandra data
files for data sovereignty and offline analytics.
• Luckily, we back up all snapshots to Amazon S3
• We didn’t want to write a whole snapshot access UI
and service for our website
• Instead, we just provided read-only IAM credentials to
the S3 bucket containing those snapshots
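The read-only credentials could be backed by an IAM policy along these lines. This is a sketch, not the policy Instaclustr used: the bucket name and prefix are hypothetical, and only listing and object reads are allowed.

```python
import json
from typing import Any, Dict


def read_only_snapshot_policy(bucket: str, prefix: str = "") -> Dict[str, Any]:
    """IAM policy document granting list and read access to snapshot objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reads apply to the objects under the prefix.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
            },
        ],
    }


print(json.dumps(read_only_snapshot_policy("example-snapshots", "cluster-a/"), indent=2))
```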
Custom solutions: a second example
A customer wants to migrate their existing on-premises
cluster to AWS/Instaclustr.
• No public IP access to their cluster
• Use AWS virtual private gateway to connect to their
concentrator
• Let Cassandra’s multi-data-center support handle the data
sync...
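Once the two networks can reach each other, the data sync is driven by keyspace replication settings. A sketch of the CQL involved, assuming the keyspace uses `NetworkTopologyStrategy`; the keyspace and data center names below are hypothetical and must match what the cluster's snitch reports.

```python
from typing import Dict


def alter_replication_cql(keyspace: str, replicas_per_dc: Dict[str, int]) -> str:
    """Build an ALTER KEYSPACE statement that replicates to every listed data center."""
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    dcs = ", ".join(f"'{dc}': {count}" for dc, count in replicas_per_dc.items())
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {dcs}}};"
    )


# Hypothetical names: the existing on-premises data center plus the new AWS one.
print(alter_replication_cql("app", {"on_prem": 3, "aws": 3}))
```

Changing replication only affects where new writes go; existing data still has to be streamed to the new data center before the old one is retired.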
Key takeaways
• Using a VPC per service simplifies multi-tenancy
• VPCs offer a number of connectivity options
• Ensure your service supports robust authentication
• VPC multi-tenancy allows custom connectivity and
functionality without impacting other customers