Business Track: Building a Private Cloud to Empower the Business at Goldman Sachs
Building a Private Cloud to Empower the Business at Goldman Sachs
What are we building with MongoDB?
SecureDocs
What is it?
GS employees' secure e-briefcase
Access from mobile and traditional clients
What tech backs it?
MongoDB 2.2 and Apache Tomcat 7
Hardware load balancing
Why Mongo?
Completely user driven tagging structure
Out of the box HA
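A user-driven tagging structure maps naturally onto a document with an array field. The following is a minimal sketch, assuming hypothetical field names (`owner`, `filename`, `tags`); the slides do not show the SecureDocs schema. The filter uses MongoDB's `$all` operator, and the small `matches` helper only imitates that one operator in memory so the example runs without a server.

```python
# Hypothetical SecureDocs-style documents; all field names are assumptions.
documents = [
    {"owner": "alice", "filename": "q2-review.pdf", "tags": ["finance", "q2", "draft"]},
    {"owner": "alice", "filename": "offsite.docx", "tags": ["travel"]},
    {"owner": "bob", "filename": "q2-forecast.xlsx", "tags": ["finance", "q2"]},
]

# MongoDB-style filter: documents whose "tags" array contains every listed tag.
query = {"owner": "alice", "tags": {"$all": ["finance", "q2"]}}


def matches(doc, flt):
    """In-memory stand-in for a tiny subset of MongoDB matching (equality and $all)."""
    for field, cond in flt.items():
        value = doc.get(field)
        if isinstance(cond, dict) and "$all" in cond:
            # $all: the document's array must contain every requested tag.
            if not isinstance(value, list) or not set(cond["$all"]) <= set(value):
                return False
        elif value != cond:
            return False
    return True


found = [d["filename"] for d in documents if matches(d, query)]
```

Because tags live in a plain array, users can invent new tags without any schema change.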
2 June 21, 2013 MongoNYC
What are we building with MongoDB?
Social PipeLine
What is it?
Internal social platform for quick information sharing
Real time analytics platform for external social trends
What tech backs it?
MongoDB 2.2, Apache Kafka, Solr and Apache Tomcat 7
Commodity hardware on all layers
Why Mongo?
Highly unstructured data across all possible social sources
Sharding and performance
Why MongoDB?
Scale out For Performance and Size
Global Availability and Resiliency
Statement-Level Transaction and Consistency Semantics
Strong Consistency Where Needed
Relaxed Consistency Where Possible
Easy to Use Powerful APIs
No ORM required
10gen
Why MongoDB?
“Sweet-spot” Between Filesystems and Relational Database
Security Model
Primary Keys and Secondary Indexes
Replication and Sharding
Highly Structured – but Not Enforced
RDBMS
Challenges
We’re a bank …
DaaS in a Private Cloud: Motivations
Facilitate Scale out For Performance and Size
Global Availability and Resiliency
Rapid Deployment + Development
Efficiencies and Economies of Scale
“Late Affinity” of purpose
Platform
Version
Infrastructure Agility
Spare hardware
On-boarding pipeline
Supply-side Inventory Management
Keep the platform “easy to use”
DaaS in a Private Cloud: Challenges
Building for unknown use cases
Defining “shapes”
Database platform specific hardware pools
Virtualization + Shared tenancy
Performance and scale considerations
SSD Storage
Security and Controls
Integrated into on-boarding pipeline
Audit
Backups and Archive
Off-host / large footprint
Sensitive Data and Masking
Inventory Management
Location aware for geographic resiliency
DaaS in a Private Cloud: Challenges
#1 Challenge: CPU :: Memory :: Storage :: Price
Moving to “cloud” means limiting choice on these ratios
Scale out for storage: may over-allocate compute
Scale out for compute: may over-allocate storage
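The ratio problem can be shown with simple arithmetic. The sketch below assumes one fixed node shape (4 cores, 32 GB memory, 500 GB storage); these numbers and the two workloads are hypothetical and chosen only to illustrate the point, not taken from the talk.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """A fixed node size offered by the cloud (values used below are hypothetical)."""
    cores: int
    memory_gb: int
    storage_gb: int


def nodes_needed(shape, cores, memory_gb, storage_gb):
    """Nodes required to satisfy the most constrained resource."""
    return max(
        math.ceil(cores / shape.cores),
        math.ceil(memory_gb / shape.memory_gb),
        math.ceil(storage_gb / shape.storage_gb),
    )


def surplus(shape, cores, memory_gb, storage_gb):
    """Resources allocated beyond what the workload asked for."""
    n = nodes_needed(shape, cores, memory_gb, storage_gb)
    return {
        "nodes": n,
        "spare_cores": n * shape.cores - cores,
        "spare_memory_gb": n * shape.memory_gb - memory_gb,
        "spare_storage_gb": n * shape.storage_gb - storage_gb,
    }


shape = Shape(cores=4, memory_gb=32, storage_gb=500)

# Storage-heavy workload: scaling out for storage over-allocates compute.
storage_heavy = surplus(shape, cores=4, memory_gb=32, storage_gb=4000)

# Compute-heavy workload: scaling out for compute over-allocates storage.
compute_heavy = surplus(shape, cores=32, memory_gb=64, storage_gb=500)
```

Both workloads need eight nodes of this shape, but the first leaves 28 cores idle and the second leaves 3,500 GB of storage unused.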
Onboarding MongoDB @ Goldman Sachs
Before: MongoDB Cluster Topologies Not Standardized
DevOps Model, Informal User Groups
Informal 10gen Engagement
Various Versions of MongoDB
After: Private Cloud Service w/ Standardized Topologies
Fully Onboarded and Supported Database Platform
Formalized 10gen Relationship (via Database Group)
Standardize on MongoDB Enterprise Edition
Engineering MongoDB for Private Cloud
Supply Flow
Provision Virtual Machine
Register Node as Available
Nodes are NOT configured for specific cluster
Demand Flow
User Orders Cluster Based on Primary & Resiliency Region
Reserve Nodes from Available Inventory
Perform operations to give node “Personality”
Configuration
Seed First Node, Expand With Others
Build Based on Inventory
Delivery of Cluster to Requestor
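The supply and demand flows above can be sketched as a small inventory model. This is an illustration under stated assumptions, not the actual Goldman Sachs tooling: the class and function names, the region codes, and the two-plus-one node layout are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    host: str
    region: str
    cluster: Optional[str] = None  # None means generic: no "personality" yet


class Inventory:
    """Pool of provisioned VMs that are not tied to any cluster."""

    def __init__(self):
        self.available: List[Node] = []

    def register(self, node):
        # Supply flow: a freshly provisioned VM is registered as available.
        self.available.append(node)

    def reserve(self, wanted):
        # Demand flow: take nodes per region, e.g. {"NY": 2, "LDN": 1}.
        # Every region is checked first so a failed order reserves nothing.
        picked = []
        for region, count in wanted.items():
            candidates = [n for n in self.available if n.region == region][:count]
            if len(candidates) < count:
                raise RuntimeError(f"not enough available nodes in {region}")
            picked.extend(candidates)
        for n in picked:
            self.available.remove(n)
        return picked


def order_cluster(inventory, name, primary_region, resiliency_region):
    """Reserve nodes, give them a personality, and return them seed-first."""
    if primary_region == resiliency_region:
        raise ValueError("resiliency region must differ from primary region")
    # Assumed layout: two nodes in the primary region, one in the resiliency region.
    nodes = inventory.reserve({primary_region: 2, resiliency_region: 1})
    for n in nodes:
        n.cluster = name  # "personality": the node now belongs to this cluster
    return nodes  # nodes[0] seeds the cluster; the rest are added afterwards


inventory = Inventory()
for host, region in [("vm1", "NY"), ("vm2", "NY"), ("vm3", "LDN"), ("vm4", "LDN")]:
    inventory.register(Node(host, region))

cluster = order_cluster(inventory, "demo-cluster", "NY", "LDN")
```

Keeping nodes generic until an order arrives is what lets the same pool serve any cluster request.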
MongoDB for Private Cloud
Topology
Required global topology for out of region resiliency (min = 3 nodes)
Each cluster is considered a “building block” for larger sharded clusters
MongoC and MongoS co-located with MongoD
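The topology rule can be expressed as a check that runs before a replica set is configured. The sketch below builds a replica set configuration document (the `_id`, `members`, `host`, and `tags` fields are standard MongoDB replica set configuration fields). It assumes that "out of region resiliency" means the members span at least two regions; the host names and region codes are hypothetical.

```python
def build_replica_set_config(name, members):
    """Build a replica set configuration document from (host, region) pairs.

    Enforces the rule from the slide: at least 3 nodes, and (as an assumed
    reading of "out of region resiliency") more than one region.
    """
    if len(members) < 3:
        raise ValueError("a cluster needs at least 3 nodes")
    regions = {region for _, region in members}
    if len(regions) < 2:
        raise ValueError("nodes must span at least two regions")
    return {
        "_id": name,
        "members": [
            {"_id": i, "host": host, "tags": {"region": region}}
            for i, (host, region) in enumerate(members)
        ],
    }


# Hypothetical hosts: two in the primary region, one in the resiliency region.
members = [
    ("mongo-ny-1.example.com:27017", "NY"),
    ("mongo-ny-2.example.com:27017", "NY"),
    ("mongo-ldn-1.example.com:27017", "LDN"),
]
config = build_replica_set_config("rs0", members)
```

A document of this shape is what would be passed to replica set initiation; here it is only constructed and validated.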
Sharding
Teams encouraged to consider Shard Key even if no sharding plans
Sharding is the only supported way to grow (fixed internal storage)
Provided with a single Shard by default
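Why the shard key deserves early thought can be seen in a simplified model of range-based routing. The sketch below is not MongoDB's implementation; the split points, shard names, and key values are hypothetical. It shows that a monotonically increasing key sends every new insert to the last chunk, while a well-spread key distributes writes.

```python
import bisect
from collections import Counter

# Hypothetical split points over the shard key space. Chunk i covers the
# half-open range between consecutive split points; each chunk lives on one shard.
boundaries = [1000, 2000, 3000]
chunk_to_shard = ["shard0", "shard1", "shard2", "shard3"]


def route(shard_key_value):
    """Return the shard owning the chunk that contains the key value."""
    return chunk_to_shard[bisect.bisect_right(boundaries, shard_key_value)]


# A monotonically increasing key (e.g. a counter) hits only the last chunk.
increasing = Counter(route(k) for k in range(3000, 3100))

# A key whose values are spread over the key space reaches every shard.
spread = Counter(route((k * 1237) % 4000) for k in range(100))
```

Since sharding is the only supported way to grow, a key chosen like `increasing` would leave added shards idle for writes until chunks are rebalanced.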
Monitoring & Self Service
Custom Monitoring Stack
Ordering Automation and Developer Self-Service
MongoDB for Private Cloud
Backup
Periodic backups to object storage
Working toward Point in Time Recovery (PITR)
Security
Kerberos ticket based authentication required
Authorization policies will continue to mature
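On the client side, Kerberos authentication is typically requested through the connection string. The sketch below only builds such a string; it uses the `authMechanism=GSSAPI` and `authSource=$external` options as documented for current MongoDB drivers. The slides do not say how clients were configured, and the principal, hosts, and replica set name are hypothetical.

```python
from urllib.parse import quote_plus


def kerberos_uri(principal, hosts, replica_set):
    """Build a MongoDB connection string that requests Kerberos (GSSAPI) auth."""
    user = quote_plus(principal)  # the '@' in the principal must be percent-encoded
    return (
        f"mongodb://{user}@{','.join(hosts)}/"
        f"?authMechanism=GSSAPI&authSource=$external&replicaSet={replica_set}"
    )


uri = kerberos_uri(
    "svc-app@EXAMPLE.COM",                      # hypothetical Kerberos principal
    ["mongo-1.example.com:27017", "mongo-2.example.com:27017"],
    "rs0",
)
```

No password appears in the string: the driver relies on a ticket already present in the client's Kerberos credential cache.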
Support
Database team supports the service offering only
Use cases that utilize “Sensitive Data” are not yet supported
Private Cloud Challenges
Unlike Public Cloud
• We don’t profit from underutilization
• Our incentives are different
• This dramatically affects our scale out approach
One Size Fits All
• Sharding is primary strategy for growing
• Both small and large apps waste resources
• CPU/Memory vs. Storage
MongoDB Cloud Goals
• Utilize available resources efficiently
• Maintain customization expected by users
• Maintain ease of use
Ideal Shape Differs by App
• Small impedance mismatch preferred if it enables scale
• Evaluate more shapes if mismatch is egregious
Take a < $10,000 Machine, Split it 1,2,4,8 ways and Build MongoDB Service...
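The effect of splitting one machine can be worked through numerically. The sketch below assumes a hypothetical machine (16 cores, 128 GB memory, 2,000 GB storage) priced at the slide's $10,000 ceiling; only the price ceiling and the 1, 2, 4, 8 splits come from the talk.

```python
def split_machine(price_usd, cores, memory_gb, storage_gb, ways):
    """Divide one physical machine evenly into `ways` virtual nodes."""
    return {
        "ways": ways,
        "cores": cores / ways,
        "memory_gb": memory_gb / ways,
        "storage_gb": storage_gb / ways,
        "price_usd": price_usd / ways,
    }


# Hypothetical hardware spec; the slide gives only the "< $10,000" price ceiling.
shapes = [split_machine(10_000, 16, 128, 2000, ways) for ways in (1, 2, 4, 8)]

# Storage per core is identical for every split: an even split changes node
# size but not the CPU :: Memory :: Storage ratio.
storage_per_core = {s["storage_gb"] / s["cores"] for s in shapes}
```

An even split yields smaller nodes with the same resource ratio, which is why a single machine type cannot remove the shape mismatch on its own.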
Looking Forward: Cloud-Oriented Feature Requests
• Better Security Models: multi-tenancy on shared data repositories
• Enhanced Multi Master: addresses a broader array of use cases
• Compression: increase utilization of fixed storage
• Off-host Backups: object storage
• Better Shard Sizing: address shape mismatches?
• Named Clusters: introduce more “Named Resource” concepts
Questions?