building high scalable distributed framework on apache mesos
TRANSCRIPT
BuildingHighScalableFramework onApache Mesos
RAHULKUMAR
TECHNICALLEAD,SIGMOID
“A distributed system is a collection of independent computers that appears to its
users as a single coherent system.”
A distributed system application works independently and communication through messages.
q Resource Sharingq Opennessq Concurrencyq Scalabilityq Fault Toleranceq Transparency
MesosIntro
“Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and
run effectively.”
SoftwareProjectsBuiltonMesos
q Aurora isaserviceschedulerthatrunsontopofMesos,enablingyoutorunlong-runningservicesthattakeadvantageofMesos'scalability,fault-tolerance,andresourceisolation.
q Marathon isaprivatePaaSbuiltonMesos.Itautomaticallyhandleshardwareorsoftwarefailuresandensuresthatanappis“alwayson”.
q Spark isafastandgeneral-purposeclustercomputingsystemwhichmakesparalleljobseasytowrite.
q Chronos isadistributedjobschedulerthatsupportscomplexjobtopologies.Itcanbeusedasamorefault-tolerantreplacementforCron.
q ElasticSearch isadistributedsearchengine.Mesosmakesiteasytorunandscale.
CreateownFrameworkWhyweneedtowriteownframework:
q Customscheduling
q Autoscaling
q AdvancedTaskManagement
WhyMesosq TaskDistribution
q Launching,Monitoring,failuredetection
q Resourceisolation
q Containersupport
qMessagingbetweenTasks
qMakeStatedistributed
LanguageSupport
ProtocolBuffer
Protocolbuffersare usedextensivelyformessagingandserializationinsideMesosandwhendevelopingMesosframeworks.- TasksaredescribeinProtocolbuffermessage- Ithelpsittomakelanguageindependent
messageFrameworkInfo {requiredstringname=2;optionalFrameworkID id=3;optionaldoublefailover_timeout=4[default=0.0];optionalboolcheckpoint=5[default=false];optionalstringrole=6[default="*"];optionalstringhostname=7;optionalstringprincipal=8;optionalstringwebui_url =9;messageCapability{enum Type{UNKNOWN=0;REVOCABLE_RESOURCES=1;TASK_KILLING_STATE=2;}optionalTypetype=1;}repeatedCapabilitycapabilities=10;optionalLabelslabels=11;}
importorg.apache.mesos.Protos.FrameworkInfo
val framework=FrameworkInfo.newBuilder.setName(“SigApp”).setUser(”rahul").setRole("*").setCheckpoint(false).setFailoverTimeout(0.0d).build()
TheSchedulerThescheduleristhecomponentthatinteractsdirectlywiththeleadingMesosmaster
q Launchtasksonthereceivedoffersq Handlestatusupdatescheckingfortaskfailureandrestartq StatePersistandmanagefailovers
Yourframeworkschedulershouldinheritfromthe Scheduler class
schedulershouldcreateaSchedulerDriver ,whichwillmediatecommunicationbetweenyourschedulerandtheMesosmaster.
classSigScheduler() extendsScheduler {overridedef error(driver: SchedulerDriver, message:String) {}overridedef executorLost(driver:SchedulerDriver, executorId:ExecutorID,slaveId:SlaveID,status:Int){}overridedef slaveLost(driver:SchedulerDriver, slaveId:SlaveID){}overridedef disconnected(driver: SchedulerDriver) {}overridedef frameworkMessage(driver: SchedulerDriver, executorId:ExecutorID,slaveId:SlaveID,data:Array[Byte]){}overridedef statusUpdate(driver:SchedulerDriver, status:TaskStatus){println(s"received statusupdate$status")}overridedef offerRescinded(driver: SchedulerDriver, offerId:OfferID){}overridedef resourceOffers(driver: SchedulerDriver, offers:java.util.List[Offer]){//code}def submitTasks(tasks:String*)={this.synchronized {this._tasks.enqueue(tasks:_*)}}overridedef reregistered(driver: SchedulerDriver, masterInfo:MasterInfo){}
overridedef registered(driver: SchedulerDriver, frameworkId:FrameworkID,masterInfo:MasterInfo){}
TheExecutor
q ExecuterExecutingtasksasrequestedbythescheduler
q Keepingtheschedulerinformedofthestatusofthosetasks
q TaskManagement
val exec= new Executor {
override def launchTask(driver: ExecutorDriver, task: TaskInfo): Unit ={}
override def killTask(driver: ExecutorDriver, taskId: TaskID): Unit ={}
override def shutdown(driver: ExecutorDriver, taskId: TaskID): Unit ={}
}
MesosEndpoints
qHTTPendpointsavailableforagivenMesosprocess
http://master.com:5050/files/browse
http://master.com:5050/files/browse/files/debug
http://master.com:5050/files/browse
http://master.com:5050/api/v1/scheduler
http://master.com:5050/health
Thank YouQ/A