Posted on 28-Jul-2015
Scalable On-Demand Hadoop Clusters with Docker and Mesos
Andrew Nelson, Nutanix, @vmwnelson, http://virtual-hiking.blogspot.com
Chris Mutchler, VMware, @chrismutchler, http://virtualelephant.com
Agenda
New Approach for Hadoop Ops
Infrastructure Resource Considerations
Docker as the new “Unit of Work”
Future Work
Last Year’s State of the Art
Self-service and multi-tenant Hadoop
Elastic and decoupled infrastructure
Extensible blueprinting
New Goals
Operationalize multiple frameworks
Decoupled service architecture
Flexible and developer-friendly form factor
Apache Mesos Introduction
Started at Berkeley
Graduated to top-level Apache project in 2013
Commercial entity is Mesosphere
https://github.com/apache/mesos/
Mesos as a Multi-Tenant Resource Pool
Source: https://github.com/mesos/myriad/blob/phase1/docs/how-it-works.md
Tools to Build and Scale
Serengeti, VMware https://github.com/vmware-serengeti
BOSH, Pivotal https://github.com/cloudfoundry/bosh
Cloudify, Gigaspaces https://github.com/CloudifySource/cloudify
Cloudbreak, SequenceIQ https://github.com/sequenceiq/cloudbreak
Advantages for Ops
Mesos as a Resource Pool
Multiple concurrent frameworks
Decouple frameworks from resource pools
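The resource-pool idea can be illustrated with a toy model in which free node resources are offered to each tenant framework in turn, and each framework takes only what its own tasks need. This is a minimal sketch, not the Mesos API: the `Offer` and `Framework` classes, the node names, and the task sizes are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """A slice of one node's free resources (hypothetical model, not the Mesos API)."""
    node: str
    cpus: float
    mem_mb: int


@dataclass
class Framework:
    """A tenant framework that needs a fixed amount of resources per task."""
    name: str
    task_cpus: float
    task_mem_mb: int
    tasks_wanted: int
    placed: list = field(default_factory=list)

    def consider(self, offer: Offer) -> int:
        """Launch as many wanted tasks as fit in the offer; return that count."""
        fit = min(int(offer.cpus // self.task_cpus),
                  offer.mem_mb // self.task_mem_mb,
                  self.tasks_wanted)
        self.tasks_wanted -= fit
        self.placed.extend([offer.node] * fit)
        return fit


def run_offer_cycle(free: dict, frameworks: list) -> None:
    """Offer each node's remaining resources to every framework in order."""
    for node, (cpus, mem_mb) in free.items():
        for fw in frameworks:
            launched = fw.consider(Offer(node, cpus, mem_mb))
            cpus -= launched * fw.task_cpus
            mem_mb -= launched * fw.task_mem_mb
        free[node] = (cpus, mem_mb)  # what is left stays in the shared pool


pool = {"node1": (8.0, 16384), "node2": (8.0, 16384)}
tenants = [
    Framework("hadoop", task_cpus=2.0, task_mem_mb=4096, tasks_wanted=3),
    Framework("spark", task_cpus=1.0, task_mem_mb=2048, tasks_wanted=4),
]
run_offer_cycle(pool, tenants)
for fw in tenants:
    print(fw.name, fw.placed)
print(pool)
```

Neither framework owns a node here. Both draw from the same pool, and whatever they leave unused remains available to the next tenant.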
Compute Partitions on Mesos
[Diagram: shared vs. siloed compute partitions on Mesos. Labels shown: Hadoop, Storm, Spark, Kafka, Cassandra, Marathon]
Networking Services
Service Discovery
  Handled per framework
  Port range resource managed by Mesos slave
  For example, Marathon uses HAProxy for request routing
Per-container network monitoring
Egress rate-limiting
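To make the "port range as a managed resource" point concrete, here is a small sketch of a per-host port pool that hands ports to tasks and reclaims them when a task ends. The `PortPool` class, the task IDs, and the 31000-31003 range are assumptions chosen for illustration. It does not reproduce how the Mesos slave implements this internally.

```python
class PortPool:
    """Hands out host ports from a fixed inclusive range (illustrative only)."""

    def __init__(self, low: int, high: int) -> None:
        self._free = set(range(low, high + 1))
        self._used = {}  # port -> task id that currently holds it

    def allocate(self, task_id: str, count: int = 1) -> list:
        """Reserve the `count` lowest free ports for a task."""
        if count > len(self._free):
            raise RuntimeError("port range exhausted")
        ports = sorted(self._free)[:count]
        for port in ports:
            self._free.remove(port)
            self._used[port] = task_id
        return ports

    def release(self, task_id: str) -> None:
        """Return every port held by a finished task to the pool."""
        for port in [p for p, t in self._used.items() if t == task_id]:
            del self._used[port]
            self._free.add(port)


ports = PortPool(31000, 31003)
web = ports.allocate("web.1", 2)
db = ports.allocate("db.1")
ports.release("web.1")
web2 = ports.allocate("web.2", 2)
print(web, db, web2)
```

Because tasks get whichever ports are free, a request router in front of them has to learn the assigned ports dynamically. That is the role the slide attributes to HAProxy in the Marathon case.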
Scheduling Options
Mesos scheduling
  Capacity Scheduler
  Fair Scheduler
Tenant scheduling examples
  Hadoop on Mesos
  Myriad (YARN) on Mesos
Dev Workflow
Code Repo / Registry
  Pull / Push / Commit / Run
Automated Builds
  Version tagging
Marathon
  CI / CD
  Dependencies
  Rolling restarts
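As a sketch of how version tagging, dependencies, and rolling restarts might come together, the snippet below builds a Marathon-style app definition for a Docker image. The app IDs, image name, tag, and resource sizes are hypothetical. The field names follow Marathon's v2 app definition as commonly documented (`container.docker.image`, `dependencies`, `upgradeStrategy.minimumHealthCapacity`), and should be checked against the Marathon version actually deployed.

```python
import json


def marathon_app(app_id, image, tag, instances, depends_on=()):
    """Build an app definition dict that a CI job could submit to Marathon."""
    return {
        "id": app_id,
        "instances": instances,
        "cpus": 0.5,   # assumed per-instance size
        "mem": 256,    # MB, assumed per-instance size
        "container": {
            "type": "DOCKER",
            "docker": {
                # The image tag pins the exact build being deployed.
                "image": f"{image}:{tag}",
                "network": "BRIDGE",
                # hostPort 0 asks for a dynamically assigned host port.
                "portMappings": [{"containerPort": 8080, "hostPort": 0}],
            },
        },
        # Apps listed here should be running before this one is started.
        "dependencies": list(depends_on),
        # Keep at least half the instances healthy during a rolling restart.
        "upgradeStrategy": {"minimumHealthCapacity": 0.5},
    }


app = marathon_app("/demo/web", "registry.example.com/demo/web", "1.0.3",
                   instances=4, depends_on=["/demo/db"])
payload = json.dumps(app, indent=2)
print(payload)
```

A build pipeline would typically change only the tag between releases and resubmit the definition, leaving the scheduler to replace instances gradually.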
Registry Services
Pluggable storage
Webhooks
Image control
Security
Logging
[Diagram: a Registry containing Repositories, each containing Images]
Advantages for Developers
Interchangeable verbs for code<->containers
Choice of framework to use as their PaaS
Adopt microservices approach to app pipeline
Recommendations for Success
Start small, scale fast
Use most appropriate framework for the job
Think ahead, decouple
Plan for rolling restart capacity up front
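The last recommendation comes down to simple arithmetic. If replacement instances must start before old ones are stopped, the cluster needs spare room for one restart batch. The function below is a rough sizing sketch under that assumption. The batch percentage and instance sizes are made-up inputs, not figures from the talk.

```python
import math


def rolling_restart_headroom(instances, cpus_each, mem_mb_each, batch_percent):
    """Spare capacity needed to start one batch of replacements alongside the old instances."""
    batch = max(1, math.ceil(instances * batch_percent / 100))
    return {
        "batch_size": batch,
        "extra_cpus": batch * cpus_each,
        "extra_mem_mb": batch * mem_mb_each,
    }


# Hypothetical service: 10 instances of 2 CPUs / 4096 MB, restarted 20% at a time.
need = rolling_restart_headroom(10, 2.0, 4096, 20)
print(need)
```

Smaller batches need less headroom but lengthen the restart. Reserving that headroom when the cluster is first sized avoids having to drop capacity during an upgrade.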
Gap Analysis
Be prepared to “look under the hood”
Variable maturity and resiliency of the layers
Networking
Security
Where Are We Going Next
Scale and learn
Container-focused OS
Software-defined networking services
Discover key performance and availability metrics