building high scalable distributed framework on apache mesos

Post on 15-Apr-2017

117 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

BuildingHighScalableFramework onApache Mesos

RAHULKUMAR

TECHNICALLEAD,SIGMOID

“A distributed system is a collection of independent computers that appears to its

users as a single coherent system.”

A distributed system application works independently and communication through messages.

q Resource Sharingq Opennessq Concurrencyq Scalabilityq Fault Toleranceq Transparency

MesosIntro

“Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and

run effectively.”

SoftwareProjectsBuiltonMesos

q Aurora isaserviceschedulerthatrunsontopofMesos,enablingyoutorunlong-runningservicesthattakeadvantageofMesos'scalability,fault-tolerance,andresourceisolation.

q Marathon isaprivatePaaSbuiltonMesos.Itautomaticallyhandleshardwareorsoftwarefailuresandensuresthatanappis“alwayson”.

q Spark isafastandgeneral-purposeclustercomputingsystemwhichmakesparalleljobseasytowrite.

q Chronos isadistributedjobschedulerthatsupportscomplexjobtopologies.Itcanbeusedasamorefault-tolerantreplacementforCron.

q ElasticSearch isadistributedsearchengine.Mesosmakesiteasytorunandscale.

CreateownFrameworkWhyweneedtowriteownframework:

q Customscheduling

q Autoscaling

q AdvancedTaskManagement

WhyMesosq TaskDistribution

q Launching,Monitoring,failuredetection

q Resourceisolation

q Containersupport

qMessagingbetweenTasks

qMakeStatedistributed

LanguageSupport

ProtocolBuffer

Protocolbuffersare usedextensivelyformessagingandserializationinsideMesosandwhendevelopingMesosframeworks.- TasksaredescribeinProtocolbuffermessage- Ithelpsittomakelanguageindependent

messageFrameworkInfo {requiredstringname=2;optionalFrameworkID id=3;optionaldoublefailover_timeout=4[default=0.0];optionalboolcheckpoint=5[default=false];optionalstringrole=6[default="*"];optionalstringhostname=7;optionalstringprincipal=8;optionalstringwebui_url =9;messageCapability{enum Type{UNKNOWN=0;REVOCABLE_RESOURCES=1;TASK_KILLING_STATE=2;}optionalTypetype=1;}repeatedCapabilitycapabilities=10;optionalLabelslabels=11;}

importorg.apache.mesos.Protos.FrameworkInfo

val framework=FrameworkInfo.newBuilder.setName(“SigApp”).setUser(”rahul").setRole("*").setCheckpoint(false).setFailoverTimeout(0.0d).build()

TheSchedulerThescheduleristhecomponentthatinteractsdirectlywiththeleadingMesosmaster

q Launchtasksonthereceivedoffersq Handlestatusupdatescheckingfortaskfailureandrestartq StatePersistandmanagefailovers

Yourframeworkschedulershouldinheritfromthe Scheduler class

schedulershouldcreateaSchedulerDriver ,whichwillmediatecommunicationbetweenyourschedulerandtheMesosmaster.

classSigScheduler() extendsScheduler {overridedef error(driver: SchedulerDriver, message:String) {}overridedef executorLost(driver:SchedulerDriver, executorId:ExecutorID,slaveId:SlaveID,status:Int){}overridedef slaveLost(driver:SchedulerDriver, slaveId:SlaveID){}overridedef disconnected(driver: SchedulerDriver) {}overridedef frameworkMessage(driver: SchedulerDriver, executorId:ExecutorID,slaveId:SlaveID,data:Array[Byte]){}overridedef statusUpdate(driver:SchedulerDriver, status:TaskStatus){println(s"received statusupdate$status")}overridedef offerRescinded(driver: SchedulerDriver, offerId:OfferID){}overridedef resourceOffers(driver: SchedulerDriver, offers:java.util.List[Offer]){//code}def submitTasks(tasks:String*)={this.synchronized {this._tasks.enqueue(tasks:_*)}}overridedef reregistered(driver: SchedulerDriver, masterInfo:MasterInfo){}

overridedef registered(driver: SchedulerDriver, frameworkId:FrameworkID,masterInfo:MasterInfo){}

TheExecutor

q ExecuterExecutingtasksasrequestedbythescheduler

q Keepingtheschedulerinformedofthestatusofthosetasks

q TaskManagement

val exec= new Executor {

override def launchTask(driver: ExecutorDriver, task: TaskInfo): Unit ={}

override def killTask(driver: ExecutorDriver, taskId: TaskID): Unit ={}

override def shutdown(driver: ExecutorDriver, taskId: TaskID): Unit ={}

}

MesosEndpoints

qHTTPendpointsavailableforagivenMesosprocess

http://master.com:5050/files/browse

http://master.com:5050/files/browse/files/debug

http://master.com:5050/files/browse

http://master.com:5050/api/v1/scheduler

http://master.com:5050/health

Thank YouQ/A

top related