a selective mirrored task based fault tolerance mechanism...

Research ArticleA Selective Mirrored Task Based Fault Tolerance Mechanism forBig Data Application Using Cloud

Hao Wu 1 Qinggeng Jin 2 Chenghua Zhang1 and He Guo1

1School of Software Technology Dalian University of Technology Dalian Liaoning China2Guangxi Key Laboratory of Hybrid Computation and IC Design Analysis Guangxi University for NationalitiesNanning Guangxi China

Correspondence should be addressed to Qinggeng Jin jinqinggengaliyuncom

Received 6 December 2018 Accepted 29 January 2019 Published 26 February 2019

Guest Editor Salimur Choudhury

Copyright copy 2019 HaoWu et al This is an open access article distributed under the Creative Commons Attribution License whichpermits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

With the wide deployment of cloud computing in big data processing and the growing scale of big data application managingreliability of resources becomes a critical issue Unfortunately due to the highly intricate directed-acyclic-graph (DAG) basedapplication and the flexible usage of processors (virtual machines) in cloud platform the existing fault tolerant approaches areinefficient to strike a balance between the parallelism and the topology of the DAG-based application while using the processorswhich causes a longer makespan for an application and consumes more processor time (computation cost) To address these issuesthis paper presents a novel fault tolerant framework named Fault Tolerance Algorithm using Selective Mirrored Tasks Method(FAUSIT) for the fault tolerance of running a big data application on cloud First we provide comprehensive theoretical analyseson how to improve the performance of fault tolerance for running a single task on a processor Second considering the balancebetween the parallelism and the topology of an application we present a selective mirrored task method Finally by employing theselective mirrored task method the FAUSIT is designed to improve the fault tolerance for DAG based application and incorporatestwo important objects minimizing the makespan and the computation cost Our solution approach is evaluated through rigorousperformance evaluation study using real-word workflows and the results show that the proposed FAUSIT approach outperformsexisting algorithms in terms of makespan and computation cost

1 Introduction

Recent years have witnessed that the big data analysis growsdramatically and the related applications have been usedeverywhere in both academia [1] and industry [2]There is nodenying that the developmental cloud platform technologiesplayed a key role in this process the plenty of processers incloud make sure that the scholars can handle the significantlarge-scale big data processing [3ndash6] However due to voltagefluctuation cosmic rays thermal changes or variability inmanufacturing the chip level soft errors and the physicalflaws are inevitable for a processer even when the probabilityof that is extremely low [7 8] Furthermore the abundantuse of processers by a big data application induces that theprobability cannot be ignored

Big data and big data analysis have been proposed fordescribing data sets as analytical technologies in large-scale

complex programs which need to be analyzed with advancedanalytical methods [9 10] No matter whether the bigdata applications are developed for commercial purposesor scientific researches most of these applications requiresignificant amount of computing resources such as marketstructure analysis customer trade analysis environmentalresearch and astrophysics data processing Motivated by thereasonable price rapid elasticity and shifting responsibility ofmaintenance backups and management to cloud providersmore and more big data applications have been deployed toclouds such as EC2 [11] Google Cloud [12] and MicrosoftAzure [13]

The clouds provide unlimited computing resources (fromthe userrsquos point of view) including CPU resources andGPU resources The on-demand recourses facilitate users tochoose apposite processors (with CPU or GPU resources) forexecuting their big data applications efficiently [14] However

HindawiWireless Communications and Mobile ComputingVolume 2019 Article ID 4807502 12 pageshttpsdoiorg10115520194807502

2 Wireless Communications and Mobile Computing

according to [15] there are many factors such as voltagefluctuation cosmic rays thermal changes or variability inmanufacturing which cause the processors (both CPU andGPU) to be more vulnerable Indeed the probability of faultrate is really low But as already noted plenty of processorsparticipate in computing the big data application and thecomputing time of each processor may be very long Thesepotential factors will lead to an exponential growth for thefault rate in a cycle of running a big data application

The failures caused by processors are disastrous for a bigdata application which is deployed on the processors ieonce a fault occurs on any processor the application willhave to be executed all over again and that will waste lots ofmonetary cost and time costThus improving the robustness(or reducing the fault rate) for running a big data applicationhas attracted many scholarsrsquo attention many studies havebeen exploring this problem According to the literaturethese studies are classified into twomain categories resolvingthe problem in hardware or software level

From the hardware levelrsquos perspective improving themean time between failures (MTBF) [16] of the processorsis the key to reduce the fault rate for running a big dataapplication As everyone knows it is impossible to erad-icate the failures in a processor Apart from lifting tape-out technology the [15] proposed an adaptive low-overheadfault tolerance mechanism for many-core processor whichtreats fault tolerance as a device that can be configuredand used by application when high reliability is neededAlthough [15] improves the MTBF but the risk of occurringfailures remains Therefore the other scholars seek solutionsin software level

From the software levelrsquos perspective the developedcheck-point technology [17] makes sure that a big dataapplication can be completed under any size of MTBF Thusmany check-point strategies are proposed to resolve thisproblem such as [18 19]These strategies only pay a little extracost but they can complete the applications under any sizeof MTBF As a result almost all of cloud platforms providecheck-point interface for users However these strategies didnot consider the date dependencies between the processorswhich make it inappropriate to big data application runningon cloud

In order to handle the DAG based applications the copytask based method (also known as primary backup basedmethod) is proposed to resolve the problem In [20] Qin andJiang proposed an algorithm eFRD to enable a systems faulttolerance and maximize its reliability On the basis of eFRDZhu et al [21] developed an approach for task allocationand message transmission to ensure faults can be toleratedduring the workflow execution But the task based methodswill make the makespan have a long delay due to the fact thatthe backup tasks are not starting with the original tasks

To the best of our knowledge for the big data applicationthere are no proper check-point strategies which can handlethe data dependencies among the processors simultaneouslyIn this paper we propose a novel check-point strategy forbig data applications which considers the effect of the datacommunications The proposed strategy adopts high levelfailure model to resolve the problem which makes it closer

to practice Meanwhile the subsequent effect caused by datadependencies after a failure is also considered in our strategy

The main contributions of this paper are as follows(i) A selective mirrored task method is proposed for the

fault tolerance of the key subtasks in aDAGbased application(ii) On the basis of the selective mirrored task method

a novel check-point framework is proposed to resolve thefault tolerance for a big data application running on cloudthe framework is named Fault Tolerance Algorithm usingSelective Mirrored Tasks Method (FAUSIT)

(iii) A thorough performance analysis is conducted forFAUSIT through experiments on randomly generated test bigdata application as well as real-world application traces

The rest of this paper is structured as follows The relatedwork is summarized in Section 2 Section 3 introducesthe models of the big data application the cloud platformand the MTBF and then formally defines the problem thepaper is addressing Section 4 presents the novel check-point strategy for the big data application running on cloudSection 5 conducts extensive experiments to evaluate theperformance of our algorithm Section 6 concludes the paperwith summary and future directions

2 Related Work

Over the last two decades owing to the increasing scale ofthe big data applications [22] the fault tolerance for big dataapplication is becomingmore andmore crucial Considerableresearch has been explored by scholars In this section wesummarize the research in terms of theoretic methods andheuristic methods

A number of theoretic methods have been explored byscholars For a task running on a processor Young [21]proposed a first order failure model and figured out anapproximation to the optimum check-point interval whichis 120593119900119901119905 = radic2120575119872 where 120575 is the time to write a check-point file M is the mean time between failures for theprocessor and 120593119900119901119905 is the optimum computation intervalbetween writing check-point files However the model inYoung [21] will never have more than a single failure in anygiven computation interval This assumption goes againstsome practical situations for instance there may be morethan one failure occurring in a computation interval

Due to the downside of the model in Yang [21] Daly[22] proposed a higher order failure model to estimatethe optimum check-point interval The model of Daly [22]assumes that there may be more than one failure occurringin a computation interval which is closer to the realisticsituation The optimum computation interval figured out byDaly [22] is

120593119900119901119905

= radic2120575119872[1 + 13 ( 1205752119872)12 + 19 ( 1205752119872)] minus 120575 if 120575 lt 2119872119872 if 120575 ge 2119872

(1)

Wireless Communications and Mobile Computing 3

120593119900119901119905 =

radic2120575119872 minus 120575 if 120575 lt 12119872119872 if 120575 ge 12119872

(2)

which is a good rule of thumb for most practical systemsHowever the models in both Yang [22] and Daly [22]

are aimed at one task running on a processor which arenot applicable to DAG based application running on cloudfor the following reasons First there are many subtasksrunning on different processors the completion time ofeach subtask may have influence on the successive subtasksSecond the importance of each subtask in a DAG (a DAGbased application) is different for instance the subtaskson the critical path of the DAG are more important thanthe others Therefore some scholars proposed heuristicmethods aiming at DAG based application running oncloud

Aiming to resolve the fault tolerance for DAG basedapplication running on cloud Zhu [23] and Qin [24]proposed copy task based methods In general the basicidea of copy task based methods is running an iden-tical copy of each subtasks on different processors thesubtasks and their copies can be mutually excluded intime However these approaches assume that tasks areindependent of one other which cannot meet the needsof real-time systems where tasks have precedence con-straints In [24] for given two tasks the authors definedthe necessary conditions for their backup copies to safelyoverlap in time with each other and proposed a newoverlapping scheme named eFRD (efficient fault-tolerantreliability-driven algorithm) which can tolerate processorsfailures in a heterogeneous system with fully connectednetwork

In Zhu [23] on the basis of Qin [24] the authorsestablished a real-time application fault-tolerant model thatextends the traditional copy task based model by incorpo-rating the cloud characteristics Based on this model theauthors developed approaches for subtask allocation andmessage transmission to ensure faults can be tolerated duringthe application execution and proposed a dynamic faulttolerant scheduling algorithm named FASTER (fault tolerantscheduling algorithm for real-time scientific workflow) Theexperiment results show that the FASTER is better than eFRD[24]

Unfortunately the disadvantage of copy task based meth-ods including Zhu [23] and Qin [24] is very conspicuousFirst the copy of each subtask may consume more resourceson the cloud which makes them uneconomical Secondthe copy of each subtask will be executed only when theoriginal subtask failed this process will waste a lot of timeand it will be even worse due to the DAG based applicationie the deadline of the application will be not guaranteedin most of cases if the deadline is near to the criticalpath

Thus in this paper we will combine the advantage oftheoretic methods and heuristic methods to propose a novelfault tolerance algorithm for a big data application runningon cloud

Table 1 Cost optimization factors

Symbols Definitions119879 = 119866(119881 119864) a scientific application119866(119881 119864) DAG of the tasks in 119879119862119875 the critical path of 119866(119881 119864)120591119894 the 119894-th subtask in 119881119908119894 the weight of 119894-th tasks in 119881119881 the set of 120591119894 in 119879119890119894119895 data dependence from 120591119894 to 120591119895119864 the set of 119890119894119895120575 the time to make a check-point file120593119900119901119905 the optimal check-point interval119879119882(120593) the practical completion time for the task119877 the time to read a check-point file119883 the time before a failure in an interval119879119888 the computation time for a task119878 a schedule which maps the tasks to processors119908119905(120591119894) the weight of 120591119894119901119903119890119889(120591119894) the predecessor subtasks set of 120591119894 in 119866(119881 119864)119904119906119888119888(120591119894) the successors subtasks set of 120591119894 in 119866(119881 119864)119904119906119888119888119875(120591119894) the next subtask 120591119894 on the same processor119891119905(120591119894) the finish time of 120591119894 on the 119878119890119904119905119878(120591119894) the earliest start time of 120591119894 on the 119878119897119891119905119878(120591119894) the latest finish time of 120591119894 on the 119878119904119897119886119888119896119879(120591119894) the slack time of 120591119894 on the 119878119904119897119886119888119896119879(120591119894) the indirect slack time of 120591119894 on the 119878120572(120591119894) denotes the quantitative importance of 120591119894120572 The threshold of 120572119870119890119910119879 The set of key subtasks119872119904(119879) Themakespan of the application 119879

3 Models and Formulation

In this section we introduce the models of big data appli-cation cloud platform and the failure model then formallydefine the problem this paper is addressing To improve thereadability we sum up the main notations used throughoutthis paper in Table 1

31 Big Data Application Model The model of a big dataapplication119879 is denoted by119866(119881 119864) where119866(119881 119864) representsa DAG Besides we use 119879119888 to denote the execution time ofthe critical path in 119866(119881 119864) Each node 119899119894 isin 119881 represents asubtask 120591119894 (1 le 119894 le V) of 119879 and V is the total number ofsubtasks in 119879 119882 is the set of weights in which each wirepresents the execution time of a subtask 119899119894 on a VM 119864 isthe set of edges in 119866(119881 119864) and an edge (119899119894 119899119895) represents thedependence between 119899119894 and 119899119895 ie a task can only start afterall its predecessors have been completed

Figure 1 shows an example of DAG for a big dataworkflow consisting of twelve tasks from 1205911to 12059112 The DAGvertices related to tasks in 119879 are represented by circles whilethe directed edges denote the data dependencies among thetasks


2

3

4

5

6 8

10

7 9

1

11 12e79

e911

e811

e1012

e1112

e56

e57

e510

e68

e13

e14

e45

e35

Figure 1 A DAG based application

failure

t = 0

R

X

Tw()

Figure 2 An example of a failure

32 Cloud PlatformModel A cloud platform119862 is modeled asa set of processors 1198751 1198752 119875N and N is the total numberof processors on the cloud We use 119873 to denote the numberof processors rented by users for executing an applicationActually theN ismuch greater than themountwhich the userneed In general to reduce the cost (monetary cost or timecost) the users apply proper schedule arithmetic to deploytheir big data applications on cloud But most of schedulealgorithms did not consider the failure in each processorwhichmay consume extra cost (monetary cost and time cost)

33 The Check-Point Model The check-point technologyhas been used on cloud for years which makes applicationcomplete successfully in the shortest amount of time Ingeneral the check-point model is defined as bellow ideallythe time to complete a task on a processor is denoted by119879119888 we use 120593 to denote the check-point interval After eachcomputation interval (120593) the processor makes a backup forthe current status of the system and the time consumedby this process is represented by 120575 If a failure occurs thetime consumed to read the latest backup and restart thecomputation is denoted by119877 Finally we use119879119882(120593) to denotethe practical completion time for the task running on aprocessor while the check-point interval is 120593

Referring to Figure 2 the ideal completion time for thetask is 119879119888 = 5120593 Actually there is a failure occurring after 119883time in the third interval and it takes the processor 119877 time torestart the third interval At last the practical time consumedby the tasks is 119879119882(120593) = 5120593 + 119877 + 119883 + 412057534 The Failure Model For a given MTBF (mean timebetween failures) which is denoted by 119872 according to [21]the life distribution model for mechanical and electricalequipment is described by an exponential model Thus theprobability density function is

119891 (119905) = 1119872119890minus119905119872 (3)

Then the probability of a failures occurring before time Δ119905for a processor is represented by a cumulative distributionfunction

119875 (119905 le Δ119905) = int0Δ119905

1119872119890minus119905119872119889119905 = 1 minus 119890minusΔ119905119872 (4)

Obviously the probability of successfully completing for atime Δ119905 without a failure is

119875 (119905 gt Δ119905) = 1 minus 119875 (119905 le Δ119905) = 119890minusΔ119905119872 (5)

We use 119879119888 to denote the computation time for a subtaskrunning on a processor the 120593 denotes the compute intervalbetween two check-points Then the average number ofattempts (represented by 119873119900119886) needed to complete 119879119888 is

119873119900119886 = 119879119888120593119875 (119905 gt Δ119905) = 119879119888119890Δ119905119872120593 (6)

Therefore the total number of failures during Δ119905 is thenumber of attempts minus the number of successes

119899 (Δ119905) = 119879119888119890Δ119905119872120593 minus 119879119888120593 = 119879119888120593 (119890Δ119905119872 minus 1) (7)

Notice that this assumes that we will never have more thana single failure in any given computation interval Obviouslythis assumption is relaxed in a real-life scenarioThus in [22]the scholar presented a multiple failures model

119899 (120593)= (119879119888 minus 120575 + 120575119879119888120593)

119872 minus 119864 (120593 + 120575) + 119877 119875 (120593) minus 119864 (119877 + 120593 + 120575) [1 minus 119875 (120593)] (8)

where

119864 (120593 + 120575) = 119872 + 120593 + 1205751 minus 119890(120593+120575)119872

119864 (119877 + 120593 + 120575) = 119872 + 119877 + 120593 + 1205751 minus 119890(120593+120575)119872

119875 (120593) = 119890minus(119877+120593+120575)119872(9)

The derivation process of Formula (8) is detailed in [22] wewill not repeat the process in this paper In this paper wewill take Formula (8) as the failure model in our frameworkfor two reasons first this could only provide the MTBF (119872mean time between failures) determined by statistics [19 25]and second this model is closer to reality than the othermodel in [21]

35 Definitions In order to make readers have a betterunderstanding of this algorithm we make some definitionsfirst For a given big data application 119879 = 119866(119881 119864) and aschedule 119878 we define the following terms

351 Schedule (119878) A schedule S is amap from the subtasks in119866(119881 119864) to the processors on the cloud meanwhile the starttime and the finish time of each subtask have beenfigured outIn general the S is determined by a static schedule algorithmsuch as HEFT [26] and MSMD [27]


352 Weight (119908119905(120591119894)) The 119908119905(120591119894) is the weight (executiontime) of 120591119894 running on a processor

353 Predecessors (119901119903119890119889(120591119894)) For a subtask 120591119894 its predeces-sors subtask set is defined as below

119901119903119890119889 (120591119894) = 119905119886119906119895 | 120591119895 isin 119881 and (120591119894 120591119895) isin 119864 (10)

354 Successors (119904119906119888119888(120591119894)) For a subtask 120591119894 its successorssubtask set is defined as below

119901119903119890119889 (120591119894) = 119905119886119906119895 | 120591119895 isin 119881 and (120591119895 120591119894) isin 119864 (11)

355 Successor on the Processor (119878119906119888119888119875(120591119894)) For a givenschedule 119878 the 119878119906119888119888119875(120591119894) of 120591119894 is the set of the next subtaskdeployed on the same processor

356 Finish Time (119891119905(120591119894)) The 119891119905(120591119894) denotes the finish timeof 120591119894 on the schedule 119878357 Earliest Start Time on the Schedule (119890119904119905119878(120591119894)) For agiven schedule 119878 the start time of 120591119894 is the 119890119904119905119878(120591119894) It should benoted that the 119890119904119905119878(120591119894) is different with the traditional earlieststart time in DAG

358 Latest FinishTime on the Schedule (119897119891119905119878(120591119894)) For a givenschedule 119878 the 119897119891119905119878(120591119894) of 120591119894 is defined below

119897119891119905119878 (120591119894) = 119891119905 (120591119894) if 119904119906119888119888 (120591119894) and 119878119906119888119888119875 (120591119894) are empty

min120591119895isin119904119906119888119888(120591119894)or119878119906119888119888119875(120591119894)

119890119904119905119878 (120591119894) otherwise (12)

Samewith 119890119904119905119878(120591119894) the 119897119891119905119878(120591119894) is different with the traditionalearliest start time in DAG

359 Slack Time (119904119897119886119888119896119879(120591119894)) For a given schedule S the119904119897119886119888119896119879(120591119894) of 120591119894 is defined as follows

119904119897119886119888119896119879 (120591119894) = 119897119891119905119878 (120591119894) minus 119890119904119905119878 (120591119894) minus 119908119890119894119892ℎ119905 (120591119894) (13)

3510 Indirect Slack Time (119904119897119886119888119896119879(120591119894)) For a given scheduleS the 119904119897119886119888119896119879(120591119894) of 120591119894 is defined as follows

119904119897119886119888119896119879 (120591119894) =

119904119897119886119888119896119879 (120591119894) if 119904119897119886119888119896119879 (120591119894) = 0min 119908119905 (120591119894)119908119891 (120591119894) + 119908119905 (120591119895)119904119897119886119888119896119879 (120591119895) else if 119904119906119888119888 (120591119894) or 119878119906119888119888119875 (120591119894) = 00 otherwise

(14)

The 119904119897119886119888119896119879(120591119894) denotes that the slack time of a subtask can beshared by its predecessors

3511 Importance (120572(120591119894)) The 120572(120591119894) denotes the quantitativeimportance of the subtask 120591119894 in the schedule 119878 The 120572(120591119894) iscalculated by

120572 (120591119894) = 119904119897119886119888119896119879 (120591119894)119908119905 (120591119894) (15)

3512 Threshold of Importance (120572) The threshold 120572 denoteswhether a subtask is a key subtask ie if 120572(120591119894) lt 120572 then thesubtask 120591119894 is a key subtask3513 The Set of Key Subtask (119870119890119910119879) The signal 119870119890119910119879denotes the set of key subtask for a given schedule 119878

36 Problem Formalization The ultimate objective of thiswork is to provide a high-performance fault tolerancemecha-nism andmake sure that the proposed fault tolerance mecha-nism will consume less computation cost and makespanThecomputation cost represents the processor time consumedby all the subtasks in 119879 thus the object of minimizingcomputation cost is defined by

Minimize sum120591119894isin119879

119879119882 (120591119894) (16)

The makespan can be defined as the overall time to executethe whole workflow by considering the finish time of the lastsuccessfully completed task For an application 119879 this objectis denoted by

Minimize 119872119904 (119879) (17)

4 The Fault Tolerance Algorithm

In this section we first discuss the basic idea of our algorithmfor the fault tolerance of running big data application on


cloudThen on the basis of the idea we will propose the faulttolerance algorithm using Selective Mirrored Tasks Method

41 The Basic Idea As show in Section 2 the theoreticmethods which are devoted to find the optimal 120593119900119901119905 are notapplicable to the DAG based application even if the 120593119900119901119905 theyhave determined is very accurate for one task running on aprocessor Besides the heuristic methods based on the copytaskwill waste a lot of extra resource and the completion timeof the application may be delayed by much more time

To find a better solution we will integrate the advan-tages of theoretic methods and heuristic methods to pro-pose a high-performance and economical algorithm for bigdata application The check-point method with an optimalcomputation interval 120593119900119901119905 is a dominant and economicalmethod for one task running on a processor thus thecheck-point mechanism is the major means in our approachFurthermore owing to the parallelism and the dependen-cies in a DAG based application the importance of eachsubtask is different ie the subtasks on the critical pathare more important than the others The fault toleranceperformance of these subtasks which adopt check-pointmethod is insufficient because the completion time of anapplication depends to a great extent on the completion timeof these subtasks Therefore for the important subtasks wewill improve the fault tolerance performance of an applicationby introducing the task copy based method In the taskcopy based methods [24] the original task and the copydo not start at the same time to reduce the completiontime the original task and the copy will start at the sametime

In summary the basic idea is as follows First identifythe important subtasks in the DAG based application whichare named as key subtasks in the rest of this article Thenapply the task copy based methods to the key subtasksmeanwhile all the subtasks will employ the check-pointtechnology to improve the fault tolerance performance butthe key subtasks and the normal subtasks will use differentoptimal computation interval 120593119900119901119905 the details of which willbe described in the following sections

It should be noted that our fault tolerance algorithm willnot schedule the subtasks on the processors we just providea fault tolerance mechanism based on the existing staticscheduler algorithm (such as HEFT and MSMD) to makesure that the application can be completedwith theminimumof time

42 Fault Tolerance Algorithm Using Selective Mirrored TasksMethod (FAUSIT) Based on the basic idea above we proposethe FAUSIT to improve the fault tolerance for executinga large-scale big data application the pseudocode of theFAUSIT is listed in Algorithm 1

As shown in Algorithm 1 the input of FAUSIT isa map from subtasks to processors which is determinedby a static scheduler algorithm and the output is faulttolerance operation determined by FAUSIT The functionDetermineKeyTasks() in Line (1) is to find the key subtasksaccording to the schedule 119878 and the application 119879 Then the

Input a schedule S of an application 119879Output a fault tolerance operation 119865(1)DetermineKeyTasks(119878 119879) 997888rarr 119870119890119910119879(2)DeployKeyTasks(119870119890119910119879) 997888rarr 119865(3) return 119865

Algorithm 1 Fault Tolerance Algorithm using Selective MirroredTasks Method (FAUSIT)

functionDeployKeyTasks() in Line (2) deploys the mirroredsubtasks and determine the proper 120593119900119901119905 for the subtasks

In the following content in this subsection we willexpound the two functions in FAUSIT

421 Function of DetermineKeyTasks() The function ofDetermineKeyTasks() is to determine the key subtasks in theDAGbased application In order tomake readers have a betterunderstanding of this function we need to expound the keysubtask and the indirect slack time clearly

Definition 1 Key subtask in a schedule 119878 the finish time of asubtask has influence on the start time of its successors if theinfluence exceeds a threshold we define the subtask is a keysubtask

The existence of the key subtasks is very meaningful toour FAUSIT algorithm For a given schedule 119878 in the idealcase each subtask as well as the application will be finishedin accordance with the 119878 In practice a subtask may failwhen it has executed for a certain time then the processorwill load the latest check-point files for continuation At thispoint the delay produced by the failure subtaskmay affect thesuccessors For the subtasks which have sufficient slack timethe start time of the successors is free from the failed subtaskOn the contrary if the failed subtask has little slack time itwill affect the start time of the successors undoubtedly Givenall that we need to deal with the key subtasks which has littleslack time

Definition 2 Indirect slack time for two subtasks 120591119894 and 120591119895120591119895 is the successor of 120591119894 if 120591119895 has slack time (defined inSection 359) the slack time can be shared by 120591119894 and theshared slack time is indirect slack time for 120591119894

The indirect slack time is a useful parameter in ourFAUSIT algorithm the existence of which will make theFAUSIT save a lot of time (makespan of the application) andcost For a given schedule 119878 a subtask may have sufficientslack timewhich can be shared by predecessorsThus the pre-decessors may have enough slack time to deal with failuresthen the completion time of the predecessors and the subtaskwill not delay the makespan of the application Indeed theindirect slack time is the key parameter to determine whethera subtask is a key subtask Moreover the indirect slack timereduces the count of the key subtasks in a big data applicationwhichwill save a lot of cost because the key subtaskwill applymirrored task method


Input a schedule S of an application 119879 120572Output the key subtask set 119870119890119910119879(1) Determine the 119891119905() 119890119904119905119878() and the 119897119891119905119878() of the sub-tasks in 119879 according to 119878(2) Determine the 119904119897119886119888119896119879() of each subtask(3) Determine the 119904119897119886119888119896119879() of the subtasks by recursion(4) Determine the set of key subtasks according to 120572 997888rarr 119870119890119910119879(5) return 119870119890119910119879

Algorithm 2 DetermineKeyTasks()

The pseudocode for function DetermineKeyTasks() isshown in Algorithm 2 The input of DetermineKeyTasks()is a schedule 119878 of an application 119879 and the threshold 120572the output is the set of key subtasks First in Line (1) the119891119905() 119890119904119905119878() and the 119897119891119905119878() of the subtasks are determinedaccording to Sections 356 and 357 and Formula (12) Thenthe 119904119897119886119888119896119879() of each subtask is figured out by Formula (13)in Line (2) Line (3) determines the 119904119897119886119888119896119879() of the subtasksby recursion Finally the set of key subtasks is determinedaccording to the threshold 120572

Table 2 shows the process of DetermineKeyTasks()When the 120572 = 015 Figure 3 shows the key subtaskswhich shall adopt the mirrored task method to improve theperformance of fault tolerance

It should be noted that the threshold 120572 is given by theusers which is related to themakespan of the application iethe higher 120572 leads to more key subtasks Then the makespanof the application is shorter On the contrary the smaller 120572will lead to a longer makespan

422 Function of DeployKeyTasks() The function ofDeployKeyTasks() is to deal with the key subtasks whichminimizes the makespan of the application to the leastextent The main operation of DeployKeyTasks() is usingmirrored task method in order to make readers have a betterunderstanding of this function we need to expound themirrored task method first

The mirrored task method is to deploy a copy of a keysubtask on another processor the original subtask is denotedby 120591119875119894 and the copy of the subtask is denoted by 120591119862119894 The120591119875119894 and the 120591119862119894 start at the same time and the check-pointinterval of them is 2120593119900119901119905 (the 120593119900119901119905 is determined by [22]) Thedistinction between the 120591119875119894 and the 120591119862119894 is that the first check-point interval of 120591119875119894 is 120593119900119901119905 meanwhile the first check-pointinterval of 120591119862119894 is 2120593119900119901119905 Obviously once a failure occurs inone of the processors the interlaced check-point interval ofthe two same subtasks makes sure that the time delayed bydealing with the failure is 119882 (the time to read a check-pointfile)

Figure 4 shows an example of the mirrored task methodThe Figure 4(a) displays the ideal situation of the mirroredtask method ie there are no failures happen in both 119875119896 and119875119897 and the finish time is 4120593 + 2120575 In Figure 4(b) there isonly one failure happening on 119875119897 in the second check-point

interval 1205932 First the processor 119875119897 reads the latest check-point file named 1205751119896 Then with time goes by the processor119875119897 will immediately load the latest check-point file 1205753119896 when itgenerates Thus the finish time is 4120601 + 2120575 + W Figure 4(c)illustrates the worst case both 119875119896 and 119875119897 have a failure in thesame interval 1205932 the two processors will have to load thelatest check-point file 1205751119896 Thus the finish time is 4120593 + 2120575 +119882 + 119883

Obviously themirrored taskmethod is far better than thetraditional copy task based method since the copy task willstart only when the original task failed and it will waste a lotof time Moreover the mirrored task method is also betterthan the method in [22] since the probability of the worstcase is far less than the probability of the occurrence for onefailure in a processor

The pseudocode for function DeployKeyTasks() is dis-played in Algorithm 3 The loop in Line (1) makes sure thatall the key subtasks can be deploy on the processors Theapplied processors should be used first when deploying thekey subtasks the loop in Line (2) illustrates this constraint

Line (3) makes sure that the key subtasks 120591119861119894 have nooverlapping with other tasks on 119875119895 Lines (4)-(5) deploy 120591119861119894on 119875119895 If all the applied processors have no idle interval for 120591119861119894(Line (8)) we will have to apply a new processor and deploy120591119861119894 on it then we put the new processor into the set of appliedprocessors (Lines (9)-(11)) At last we deploy the check-pointinterval (120593119900119901119905 or 2120593119900119901119905) to the processors (Line (14)) and savethese operations in 119879 (Line (15))

It should be noted that the overlapping in Line (3) is notjust the computation time of the 120591119861119894 we also consider thedelayed time which may be caused by failures In order toavoid the overlap caused by the delayed 120591119861119894 we use the 13119908119894as the execution time of 120591119861119894 to determine whether an overlaphappen since the 13119908119894 is much greater than a delayed 120591119861119894 43 The Feasibility Study of FAUSIT The operations to theprocessors in our FAUSIT are complex such as the differentcheck-point interval and the mirrored subtasks the readersmay doubt the feasibility of FAUSITThus we will explain thefeasibility study of FAUSIT in this subsection

According to [15] Google Cloud provides the gcloudsuit for users to operate the processors (virtual machine)remotely The operations of gcloud include (but not limitedto) applying processors releasing processors check-point


75 55 36 60 65 50 75

50 45 45 40

200

Weight rarr

Weight rarr

Weightrarr

() rarr

() rarr

() rarr

0 0 0 011 022 05 0

037 078 047 1

0

= 015denotes the sub-tasks needed to applythe mirrored tasks method

St(9)St(3)

St(11)

118

7

54 12

1 9

10

6

3

2P1

P2

P3

Figure 3 An example of 120572 = 015

time

(a 1)

Pk Pl

1k

3k

2l

1 1

2

2

3 3

4

4

Pi Ci

(a) The ideal sit-uation

time

(b 1) (b 2) (b 3)

X X X

R

R

R

Pk Pl Pk Pl Pk Pl

1k 1

k

3k

1 1 1 1

2 2

2

3

3

1k

1 1

2

2

3

4

4

Pi Ci Pi Ci Pi Ci

2l

(b) One failure occurs

time

(c 1) (c 2)

X X X X

R R

Pk Pl Pk Pl

3k

3

3

1k 1

k

1 11 1

22

4 4

Pi Ci Pi Ci

2l

(c) Two failures occur simultane-ously

Figure 4 An example of mirrored tasks method

Table 2 An example of DetermineKeyTasks()

Subtasks Weight 119897119891119905() 119890119904119905() 119878119905() 1198781199051015840() 120572()1205911 50 0 50 0 184 0371205912 75 0 75 0 0 01205913 45 50 130 35 35 0781205914 55 75 130 0 0 01205915 36 130 166 0 0 01205916 60 166 226 0 68 0111205917 45 166 211 0 212 0471205918 65 226 291 0 141 0221205919 40 211 291 40 40 112059110 200 166 366 0 0 012059111 50 291 366 25 25 0512059112 75 366 441 0 0 0


Input the key subtask set 119910119879Output a fault tolerance operation 119865(1) for 120591119861119894 in 119870119890119910119879 do(2) for 119875119895 in the processors which have been applied do(3) if 120591119861119894 has no overlapping with other tasks on 119875119895 then(4) Deploy the 120591119861119894 on 119875119895(5) End for in Line (2)(6) end if(7) end for(8) if 120591119861119894 has not be deployed on the processor then(9) Apply a new processor(10) Deploy 120591119861119894 on the new processor(11) Put the new processor in the set of processors(12) end if(13) end for(14) Deploy the check-point interval to the processors(15) Save the operations in 119865(16) return 119865

Algorithm 3DeployKeyTasks()

Table 3 The Characteristics of the Big Data Application

Workflows Task CountRange

EdgeCount

Avg RunTime of Task

(Sec)

DeadlinesRange

Epigenomics

100 322

385651 10CP200 644 1000 3228

and loading check-point files These operations in gcloud canmake user implement the FAUSIT easily

5 Empirical Evaluations

The purpose of the experiments is to evaluate the perfor-mance of the developed FAUSIT algorithm We evaluate thefault tolerant of FAUSIT by comparing it with two otheralgorithms published in the literature They are FASTERalgorithm [23] and the method in [22] The main differencesof these algorithms to FAUSIT and their uses are brieflydescribed below

(i) FASTER a novel copy task based fault tolerantscheduling algorithm On the basis of copy task basedmethod the FASTER adjust the resources dynami-cally

(ii) The method in [22] it is a theoretic method whichprovides an optimal check-point interval 120593119900119901119905

51 Experiment Settings The DAG based big data appli-cations we used for the evaluation are obtained from theDAG based applications benchmark provided by PegasusWorkflowGenerator [28]We use the largest application fromthe benchmark ie Epigenomics since the bigger application

is more sensitive to failures The detailed characteristics ofthe benchmark applications can be found in [29 30] Inour experiments the number of subtasks in an applicationis ranging from 100 to 1000 Since the benchmark does notassign deadlines for each application we need to specifythe deadlines we assign a deadline for the applications it is10119862119875 Table 3 gives the characteristics of these applicationsincluding the count of tasks and edges average task executiontime and the deadlines

Because FAUSIT does not schedule the subtasks on theprocessors we hire a high-performance schedule algorithm(MSMD) to determine the map from subtasks to the proces-sors MSMD is a novel static schedule algorithm to reducethe cost and the makespan for an application On the basisof MSMD we use FAUSIT to improve the fault tolerant of anapplication

52 Evaluation Criteria The main objectives of the FAUSITare to find the optimal fault tolerant mechanism for a bigdata application running on cloud The first criterion toevaluate the performance of a fault tolerant mechanism ishow much time is delayed to complete the application iethe makespan to finish the application We introduce theconcept of makespan delay rate to indicate how much extratime consumed by the fault tolerant mechanism It is definedas follows

119872119886119896119863119890119897119877119886119905119890 = The practical makespanThe ideal makespan

(18)

where the ideal makespan represents the completion time forrunning an application without fault tolerantmechanism andany failures and the practical makespan is the completiontime consumed on practical system which has a fault tolerantmechanism and the probability of failures


ExCostR

ate

14

13

12

2 middot 10minus2 4 middot 10minus2 6 middot 10minus2 8 middot 10minus2 01

MakDelRa

te

13

12

11


(a) (b)

Figure 5 (a) TheMakDelRate and (b) the ExCostRate of 120572

MakDelRa

te 14

12

FAUSIT

FASTERlowastDalyrsquos

count of sub-tasks1000900800700600500400300200100

(a) MakDelRate

ExCostR

ate

135

13

12

125


FAUSIT


(b) ExCostRate

Figure 6 The result ofMakDelRate and ExCostRate

The second goal of the FARSIT is to minimize the extracost consumed by the fault tolerant mechanism Undoubt-edly to improve the performance of fault tolerant any faulttolerantmechanismswill have to consume extra computationtime for an application Thus the extra computation time isa key criterion to evaluate the performance of fault tolerantmechanism Therefore we define extra cost rate for theevaluation

119864119909119862119900119904119905119877119886119905119890 = The practical processors timeThe ideal processors time

(19)

where the practical processors time denotes the time con-sumed by the processors to running an application for apractical systemwith failuresThe ideal processors time is thesum of the 119908119894 in 119866(119881 119864) for an application

53 The Optimal 120572 for FAUSIT Before comparing with theother algorithms we need to determine the optimal 120572 forFAUSIT first Due to the optimal 120572 is an experimentalparameter we have to figure it out by experiments

We test the 120572 by the Epigenomics application with 1000subtasks for 10 times and make the 119879119889 = 10119879119888 the resultsare shown in Figures 5(a) and 5(b) In Figure 5(a) theMakDelRate becomes lower alone with the increased ie alarger 120572 will lead to a shorter makespan for an application

On the contrary Figure 5(b) shows that the larger 120572 will leadto more cost for an application

Through a comprehensive analysis of Figures 5(a) and5(b) we make the 120572 = 0033 which can equalize theMakDelRate and the ExCostRate and make sure that boththeMakDelRate and the ExCostRate are smaller than othersituations

54 Comparison with the Other Solutions Since the proposedFAUSIT algorithm is a heuristic algorithm themost straight-forward way to evaluate its performance is to compare withthe optimal solution when possibleWe randomly generate 10different applications for each scale of Epigenomics shown inTable 3 and we make the 119879119889 = 10119862119875

What should be paid attention to is that although theFASTER is a copy tasks based method but it is developedfor multiple DAG based big data applications As a resultit cannot compare the performance of it with the othertwo methods directly In order to make a comprehensivecomparison on the basis of FASTER we modified theFASTER to make it handle a single DAG based applicationand this new FASTER is denoted by FASTERlowast

The average of MakDelRate and ExCostRate are dis-played in Figure 6 In Figure 6(a) our FAUSIT has the min-imum MakDelRate for any size of application Meanwhile


the FASTERlowast has higherMakDelRate than our FAUSIT andthe Dalyrsquos method has the highestMakDelRateThe result inFigure 6(a) shows that our FAUSIT has the best performancefor minimizing the makespan of an application among thesemethods

In Figure 6(b) due to the fact that the Dalyrsquos methodonly adopts the check-pointmechanism this makes it has theminimum ExCostRate Meanwhile the FASTERlowast consumesthe maximum cost since the ExCostRate of it is verylarge Interestingly along with the increase of the count ofsubtasks in an application the ExCostRate of our FAUSIT isbecoming smaller which indicates that our FAUSIT has theability to handle much bigger scale of application

In conclusion compared with the FASTER ourFAUSIT outperforms FASTER on both MakDelRateand ExCostRate Compared with the Dalyrsquos method ourFAUSIT outperforms it much more on MakDelRate andonly consume 9 extra cost which still makes our FAUSIThave competitiveness in practical system Besides the 120572in our FAUSIT can satisfy the requirements from differentusers ie if the users need a shorter makespan for anapplication they can turn the 120572 up On the contrary if theusers care about the cost they can turn the 120572 down Thusthe 120572 make the FAUSIT have strong usability

6 Conclusion

This paper investigates the problem of improving the faulttolerant for a big data application running on cloud We firstanalyze the characteristics of running a task on a processorThen we present a new approach called selective mirroredtask method to deal with the imbalance between the paral-lelism and the topology of the DAG based application run-ning on multiple processors Based on the selective mirroredtask method we proposed an algorithm named FAUSIT toimprove the fault tolerant for a big data application runningon cloudmeanwhile themakespan and the computation costis minimized To evaluate the effectiveness of FAUSIT weconduct extensive simulation experiments in the context ofrandomly generated workflows which are real-world appli-cation traces Experimental results show the superiorities ofFAUSIT compared with other related algorithms such asFASTER and Dalyrsquos method In our future work due to thesuperiorities of selective mirrored task method we will try toapply it to other big data applications processing scenariossuch as improving the fault tolerant of multiple applicationsin the respect of could providers Furthermore we will alsoinvestigate the effectiveness of the selective mirrored taskmethod in parallel real-time applications running on cloud

Data Availability

The data used to support the findings of this study areavailable from the corresponding author upon request

Conflicts of Interest

The authors declare that they have no conflicts of interest

Acknowledgments

This work was supported partly by Doctoral Research Foun-dation of Liaoning under grant no 20170520306 GuangxiKey Laboratory of Hybrid Computation and IC DesignAnalysisOpen Fund (HCIC201605) Guangxi YouthCapacityImprovement Project (2018KY0166) and SupercomputingCenter of Dalian University of Technology

References

[1] M Armbrust A Fox R Griffith et al ldquoA view of cloudcomputingrdquo Communications of the ACM vol 53 no 4 pp 50ndash58 2010

[2] Market Research Media ldquoGlobal cloud computing marketforecast 2019-2024rdquo httpswwwmarketresearchmediacomp=839

[3] L F Sikos ldquoBig data applicationsrdquo inMastering Structured Dataon the Semantic Web 2015

[4] G-H Kim S Trimi and J-H Chung ldquoBig-data applications inthe government sectorrdquo Communications of the ACM vol 57no 3 pp 78ndash85 2014

[5] G Wang T S E Ng and A Shaikh ldquoProgramming yournetwork at run-time for big data applicationsrdquo in Proceedings ofthe 1st ACM International Workshop on Hot Topics in SoftwareDefined Networks (HotSDN rsquo12) pp 103ndash108 ACM HelsinkiFinland August 2012

[6] P Costa A Donnelly A Rowstron and G OrsquoShea ldquoCamdoopExploiting in-network aggregation for big data applicationsrdquo inProceedings of the 9th USENIX conference on Networked SystemsDesign and Implementation USENIX Association 2012

[7] H Liu H Ning Y Zhang Q Xiong and L T Yang ldquoRole-dependent privacy preservation for secure V2G networks inthe smart gridrdquo IEEE Transactions on Information Forensics andSecurity vol 9 no 2 pp 208ndash220 2014

[8] E Sindrilaru A Costan and V Cristea ldquoFault tolerance andrecovery in grid workflowmanagement systemsrdquo in Proceedingsof the 4th International Conference on Complex Intelligentand Software Intensive Systems (CISIS rsquo10) pp 475ndash480 IEEEFebruary 2010

[9] Q Zheng ldquoImprovingMapReduce fault tolerance in the cloudrdquoin Proceedings of the IEEE International Symposium on Par-allel and Distributed Processing Workshops and Phd Forum(IPDPSW rsquo10) IEEE April 2010

[10] L Peng et al ldquoDeep convolutional computation model forfeature learning on big data in internet of thingsrdquo IEEETransactions on Industrial Informatics vol 14 no 2 pp 790ndash798 2018

[11] Q Zhang L T Yang Z Yan Z Chen and P Li ldquoAn efficientdeep learning model to predict cloud workload for industryinformaticsrdquo IEEE Transactions on Industrial Informatics vol14 no 7 pp 3170ndash3178 2018

[12] L Man and L T Yang ldquoHybrid genetic algorithms for schedul-ing partially ordered tasks in amulti-processor environmentrdquo inProceedings of the Sixth International Conference on Real-TimeComputing Systems and Applications (RTCSA rsquo99) IEEE 1999

[13] Q Zhang L T Yang and Z Chen ldquoDeep computationmodel for unsupervised feature learning on big datardquo IEEETransactions on Services Computing vol 9 no 1 pp 161ndash1712016


[14] ldquoCluster networking in ec2rdquo httpsamazonaws-chinacomec2instance-types

[15] ldquoGoogle cloudrdquo httpscloudgooglecom[16] ldquoMicrosoft azurerdquo httpsazuremicrosoftcom[17] J Rao Y Wei J Gong and C-Z Xu ldquoQoS guarantees and

service differentiation for dynamic cloud applicationsrdquo IEEETransactions on Network and Service Management vol 10 no1 pp 43ndash55 2013

[18] J Wentao Z Chunyuan and F Jian ldquoDevice view redundancyAn adaptive low-overhead fault tolerancemechanism formany-core systemrdquo in Proceedings of the IEEE 10th International Con-ference on High Performance Computing and Communicationsampamp 2013 IEEE International Conference on Embedded andUbiquitous Computing (HPCC EUC rsquo13) IEEE 2013

[19] M Engelhardt and L J Bain ldquoOn the mean time betweenfailures for repairable systemsrdquo IEEE Transactions on Reliabilityvol 35 no 4 pp 419ndash422 1986

[20] H Akkary R Rajwar and S T Srinivasan ldquoCheckpoint pro-cessing and recovery towards scalable large instructionwindowprocessorsrdquo in Proceedings of the 36th annual IEEEACM Inter-national Symposium on Microarchitecture pp 423ndash434 IEEEComputer Society San Diego Calif USA 2003

[21] J W Young ldquoA first order approximation to the optimumcheckpoint intervalrdquo Communications of the ACM vol 17 no9 pp 530-531 1974

[22] J T Daly ldquoA higher order estimate of the optimum checkpointinterval for restart dumpsrdquo Future Generation Computer Sys-tems vol 22 no 3 pp 303ndash312 2006

[23] X Zhu J Wang H Guo D Zhu L T Yang and L LiuldquoFault-tolerant scheduling for real-time scientific workflowswith elastic resource provisioning in virtualized cloudsrdquo IEEETransactions on Parallel and Distributed Systems vol 27 no 12pp 3501ndash3517 2016

[24] X Qin and H Jiang ldquoA novel fault-tolerant scheduling algo-rithm for precedence constrained tasks in real-time heteroge-neous systemsrdquo Parallel Computing SystemsampApplications vol32 no 5-6 pp 331ndash356 2006

[25] S Mu M Su P Gao Y Wu K Li and A Y Zomaya ldquoCloudstorage over multiple data centersrdquo Handbook on Data Centerspp 691ndash725 2015

[26] HTopcuoglu SHariri andMWu ldquoPerformance-effective andlow-complexity task scheduling for heterogeneous computingrdquoIEEE Transactions on Parallel and Distributed Systems vol 13no 3 pp 260ndash274 2002

[27] H Wu X Hua Z Li and S Ren ldquoResource and instance hourminimization for deadline constrained DAG applications usingcomputer cloudsrdquo IEEETransactions on Parallel andDistributedSystems vol 27 no 3 pp 885ndash899 2016

[28] G Juve ldquoworkflowgeneratorrdquo httpsconfluencepegasusisiedudisplaypegasusworkflowgenerator 2014

[29] S Bharathi A Chervenak E Deelman GMehtaM Su and KVahi ldquoCharacterization of scientific workflowsrdquo in Proceedingsof the 3rd Workshop on Workflows in Support of Large-ScaleScience (WORKS rsquo08) pp 1ndash10 IEEE November 2008

[30] G Juve A Chervenak E Deelman S Bharathi G MehtaandKVahi ldquoCharacterizing and profiling scientificworkflowsrdquoFuture Generation Computer Systems vol 29 no 3 pp 682ndash6922013

International Journal of

AerospaceEngineeringHindawiwwwhindawicom Volume 2018

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018


Active and Passive Electronic Components

VLSI Design



Shock and Vibration


Civil EngineeringAdvances in

Acoustics and VibrationAdvances in



Electrical and Computer Engineering

Journal of

Advances inOptoElectronics

Hindawiwwwhindawicom

Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Control Scienceand Engineering

Journal of



Journal ofEngineeringVolume 2018

SensorsJournal of



RotatingMachinery


Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018


Chemical EngineeringInternational Journal of Antennas and

Propagation




Navigation and Observation


Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

Submit your manuscripts atwwwhindawicom


according to [15] there are many factors such as voltagefluctuation cosmic rays thermal changes or variability inmanufacturing which cause the processors (both CPU andGPU) to be more vulnerable Indeed the probability of faultrate is really low But as already noted plenty of processorsparticipate in computing the big data application and thecomputing time of each processor may be very long Thesepotential factors will lead to an exponential growth for thefault rate in a cycle of running a big data application

The failures caused by processors are disastrous for a bigdata application which is deployed on the processors ieonce a fault occurs on any processor the application willhave to be executed all over again and that will waste lots ofmonetary cost and time costThus improving the robustness(or reducing the fault rate) for running a big data applicationhas attracted many scholarsrsquo attention many studies havebeen exploring this problem According to the literaturethese studies are classified into twomain categories resolvingthe problem in hardware or software level

From the hardware levelrsquos perspective improving themean time between failures (MTBF) [16] of the processorsis the key to reduce the fault rate for running a big dataapplication As everyone knows it is impossible to erad-icate the failures in a processor Apart from lifting tape-out technology the [15] proposed an adaptive low-overheadfault tolerance mechanism for many-core processor whichtreats fault tolerance as a device that can be configuredand used by application when high reliability is neededAlthough [15] improves the MTBF but the risk of occurringfailures remains Therefore the other scholars seek solutionsin software level

From the software levelrsquos perspective the developedcheck-point technology [17] makes sure that a big dataapplication can be completed under any size of MTBF Thusmany check-point strategies are proposed to resolve thisproblem such as [18 19]These strategies only pay a little extracost but they can complete the applications under any sizeof MTBF As a result almost all of cloud platforms providecheck-point interface for users However these strategies didnot consider the date dependencies between the processorswhich make it inappropriate to big data application runningon cloud

In order to handle the DAG based applications the copytask based method (also known as primary backup basedmethod) is proposed to resolve the problem In [20] Qin andJiang proposed an algorithm eFRD to enable a systems faulttolerance and maximize its reliability On the basis of eFRDZhu et al [21] developed an approach for task allocationand message transmission to ensure faults can be toleratedduring the workflow execution But the task based methodswill make the makespan have a long delay due to the fact thatthe backup tasks are not starting with the original tasks

To the best of our knowledge for the big data applicationthere are no proper check-point strategies which can handlethe data dependencies among the processors simultaneouslyIn this paper we propose a novel check-point strategy forbig data applications which considers the effect of the datacommunications The proposed strategy adopts high levelfailure model to resolve the problem which makes it closer

to practice Meanwhile the subsequent effect caused by datadependencies after a failure is also considered in our strategy

The main contributions of this paper are as follows(i) A selective mirrored task method is proposed for the

fault tolerance of the key subtasks in aDAGbased application(ii) On the basis of the selective mirrored task method

a novel check-point framework is proposed to resolve thefault tolerance for a big data application running on cloudthe framework is named Fault Tolerance Algorithm usingSelective Mirrored Tasks Method (FAUSIT)

(iii) A thorough performance analysis is conducted forFAUSIT through experiments on randomly generated test bigdata application as well as real-world application traces

The rest of this paper is structured as follows The relatedwork is summarized in Section 2 Section 3 introducesthe models of the big data application the cloud platformand the MTBF and then formally defines the problem thepaper is addressing Section 4 presents the novel check-point strategy for the big data application running on cloudSection 5 conducts extensive experiments to evaluate theperformance of our algorithm Section 6 concludes the paperwith summary and future directions

2 Related Work

Over the last two decades owing to the increasing scale ofthe big data applications [22] the fault tolerance for big dataapplication is becomingmore andmore crucial Considerableresearch has been explored by scholars In this section wesummarize the research in terms of theoretic methods andheuristic methods

A number of theoretic methods have been explored byscholars For a task running on a processor Young [21]proposed a first order failure model and figured out anapproximation to the optimum check-point interval whichis 120593119900119901119905 = radic2120575119872 where 120575 is the time to write a check-point file M is the mean time between failures for theprocessor and 120593119900119901119905 is the optimum computation intervalbetween writing check-point files However the model inYoung [21] will never have more than a single failure in anygiven computation interval This assumption goes againstsome practical situations for instance there may be morethan one failure occurring in a computation interval

Due to the downside of the model in Yang [21] Daly[22] proposed a higher order failure model to estimatethe optimum check-point interval The model of Daly [22]assumes that there may be more than one failure occurringin a computation interval which is closer to the realisticsituation The optimum computation interval figured out byDaly [22] is

120593119900119901119905

= radic2120575119872[1 + 13 ( 1205752119872)12 + 19 ( 1205752119872)] minus 120575 if 120575 lt 2119872119872 if 120575 ge 2119872

(1)


120593119900119901119905 =


(2)














2

3

4

5

6 8

10

7 9

1

11 12e79

e911

e811

e1012

e1112

e56

e57

e510

e68

e13

e14

e45

e35


failure

t = 0

R

X

Tw()





119891 (119905) = 1119872119890minus119905119872 (3)


119875 (119905 le Δ119905) = int0Δ119905





119873119900119886 = 119879119888120593119875 (119905 gt Δ119905) = 119879119888119890Δ119905119872120593 (6)


119899 (Δ119905) = 119879119888119890Δ119905119872120593 minus 119879119888120593 = 119879119888120593 (119890Δ119905119872 minus 1) (7)


119899 (120593)= (119879119888 minus 120575 + 120575119879119888120593)


where

119864 (120593 + 120575) = 119872 + 120593 + 1205751 minus 119890(120593+120575)119872

119864 (119877 + 120593 + 120575) = 119872 + 119877 + 120593 + 1205751 minus 119890(120593+120575)119872

119875 (120593) = 119890minus(119877+120593+120575)119872(9)







119901119903119890119889 (120591119894) = 119905119886119906119895 | 120591119895 isin 119881 and (120591119894 120591119895) isin 119864 (10)


119901119903119890119889 (120591119894) = 119905119886119906119895 | 120591119895 isin 119881 and (120591119895 120591119894) isin 119864 (11)




119897119891119905119878 (120591119894) = 119891119905 (120591119894) if 119904119906119888119888 (120591119894) and 119878119906119888119888119875 (120591119894) are empty

min120591119895isin119904119906119888119888(120591119894)or119878119906119888119888119875(120591119894)

119890119904119905119878 (120591119894) otherwise (12)



119904119897119886119888119896119879 (120591119894) = 119897119891119905119878 (120591119894) minus 119890119904119905119878 (120591119894) minus 119908119890119894119892ℎ119905 (120591119894) (13)


119904119897119886119888119896119879 (120591119894) =


(14)



120572 (120591119894) = 119904119897119886119888119896119879 (120591119894)119908119905 (120591119894) (15)




119879119882 (120591119894) (16)


Minimize 119872119904 (119879) (17)




































75 55 36 60 65 50 75

50 45 45 40

200

Weight rarr

Weight rarr

Weightrarr

() rarr

() rarr

() rarr

0 0 0 011 022 05 0

037 078 047 1

0


St(9)St(3)

St(11)

118

7

54 12

1 9

10

6

3

2P1

P2

P3


time

(a 1)

Pk Pl

1k

3k

2l

1 1

2

2

3 3

4

4

Pi Ci


time

(b 1) (b 2) (b 3)

X X X

R

R

R

Pk Pl Pk Pl Pk Pl

1k 1

k

3k

1 1 1 1

2 2

2

3

3

1k

1 1

2

2

3

4

4

Pi Ci Pi Ci Pi Ci

2l


time

(c 1) (c 2)

X X X X

R R

Pk Pl Pk Pl

3k

3

3

1k 1

k

1 11 1

22

4 4

Pi Ci Pi Ci

2l




Subtasks Weight 119897119891119905() 119890119904119905() 119878119905() 1198781199051015840() 120572()1205911 50 0 50 0 184 0371205912 75 0 75 0 0 01205913 45 50 130 35 35 0781205914 55 75 130 0 0 01205915 36 130 166 0 0 01205916 60 166 226 0 68 0111205917 45 166 211 0 212 0471205918 65 226 291 0 141 0221205919 40 211 291 40 40 112059110 200 166 366 0 0 012059111 50 291 366 25 25 0512059112 75 366 441 0 0 0






EdgeCount

Avg RunTime of Task

(Sec)

DeadlinesRange

Epigenomics

100 322

385651 10CP200 644 1000 3228











(18)



ExCostR

ate

14

13

12


MakDelRa

te

13

12

11


(a) (b)


MakDelRa

te 14

12

FAUSIT



(a) MakDelRate

ExCostR

ate

135

13

12

125


FAUSIT


(b) ExCostRate




(19)













6 Conclusion


Data Availability




Acknowledgments


References

































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia



120593119900119901119905 =


(2)














2

3

4

5

6 8

10

7 9

1

11 12e79

e911

e811

e1012

e1112

e56

e57

e510

e68

e13

e14

e45

e35


failure

t = 0

R

X

Tw()





119891 (119905) = 1119872119890minus119905119872 (3)


119875 (119905 le Δ119905) = int0Δ119905





119873119900119886 = 119879119888120593119875 (119905 gt Δ119905) = 119879119888119890Δ119905119872120593 (6)


119899 (Δ119905) = 119879119888119890Δ119905119872120593 minus 119879119888120593 = 119879119888120593 (119890Δ119905119872 minus 1) (7)


119899 (120593)= (119879119888 minus 120575 + 120575119879119888120593)


where

119864 (120593 + 120575) = 119872 + 120593 + 1205751 minus 119890(120593+120575)119872

119864 (119877 + 120593 + 120575) = 119872 + 119877 + 120593 + 1205751 minus 119890(120593+120575)119872

119875 (120593) = 119890minus(119877+120593+120575)119872(9)







119901119903119890119889 (120591119894) = 119905119886119906119895 | 120591119895 isin 119881 and (120591119894 120591119895) isin 119864 (10)


119901119903119890119889 (120591119894) = 119905119886119906119895 | 120591119895 isin 119881 and (120591119895 120591119894) isin 119864 (11)




119897119891119905119878 (120591119894) = 119891119905 (120591119894) if 119904119906119888119888 (120591119894) and 119878119906119888119888119875 (120591119894) are empty

min120591119895isin119904119906119888119888(120591119894)or119878119906119888119888119875(120591119894)

119890119904119905119878 (120591119894) otherwise (12)



119904119897119886119888119896119879 (120591119894) = 119897119891119905119878 (120591119894) minus 119890119904119905119878 (120591119894) minus 119908119890119894119892ℎ119905 (120591119894) (13)


119904119897119886119888119896119879 (120591119894) =


(14)



120572 (120591119894) = 119904119897119886119888119896119879 (120591119894)119908119905 (120591119894) (15)




119879119882 (120591119894) (16)


Minimize 119872119904 (119879) (17)




































75 55 36 60 65 50 75

50 45 45 40

200

Weight rarr

Weight rarr

Weightrarr

() rarr

() rarr

() rarr

0 0 0 011 022 05 0

037 078 047 1

0


St(9)St(3)

St(11)

118

7

54 12

1 9

10

6

3

2P1

P2

P3


time

(a 1)

Pk Pl

1k

3k

2l

1 1

2

2

3 3

4

4

Pi Ci


time

(b 1) (b 2) (b 3)

X X X

R

R

R

Pk Pl Pk Pl Pk Pl

1k 1

k

3k

1 1 1 1

2 2

2

3

3

1k

1 1

2

2

3

4

4

Pi Ci Pi Ci Pi Ci

2l


time

(c 1) (c 2)

X X X X

R R

Pk Pl Pk Pl

3k

3

3

1k 1

k

1 11 1

22

4 4

Pi Ci Pi Ci

2l




Subtasks Weight 119897119891119905() 119890119904119905() 119878119905() 1198781199051015840() 120572()1205911 50 0 50 0 184 0371205912 75 0 75 0 0 01205913 45 50 130 35 35 0781205914 55 75 130 0 0 01205915 36 130 166 0 0 01205916 60 166 226 0 68 0111205917 45 166 211 0 212 0471205918 65 226 291 0 141 0221205919 40 211 291 40 40 112059110 200 166 366 0 0 012059111 50 291 366 25 25 0512059112 75 366 441 0 0 0






EdgeCount

Avg RunTime of Task

(Sec)

DeadlinesRange

Epigenomics

100 322

385651 10CP200 644 1000 3228











(18)



ExCostR

ate

14

13

12


MakDelRa

te

13

12

11


(a) (b)


MakDelRa

te 14

12

FAUSIT



(a) MakDelRate

ExCostR

ate

135

13

12

125


FAUSIT


(b) ExCostRate




(19)













6 Conclusion


Data Availability




Acknowledgments


References

































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia



2

3

4

5

6 8

10

7 9

1

11 12e79

e911

e811

e1012

e1112

e56

e57

e510

e68

e13

e14

e45

e35


failure

t = 0

R

X

Tw()





119891 (119905) = 1119872119890minus119905119872 (3)


119875 (119905 le Δ119905) = int0Δ119905





119873119900119886 = 119879119888120593119875 (119905 gt Δ119905) = 119879119888119890Δ119905119872120593 (6)


119899 (Δ119905) = 119879119888119890Δ119905119872120593 minus 119879119888120593 = 119879119888120593 (119890Δ119905119872 minus 1) (7)


119899 (120593)= (119879119888 minus 120575 + 120575119879119888120593)


where

119864 (120593 + 120575) = 119872 + 120593 + 1205751 minus 119890(120593+120575)119872

119864 (119877 + 120593 + 120575) = 119872 + 119877 + 120593 + 1205751 minus 119890(120593+120575)119872

119875 (120593) = 119890minus(119877+120593+120575)119872(9)







119901119903119890119889 (120591119894) = 119905119886119906119895 | 120591119895 isin 119881 and (120591119894 120591119895) isin 119864 (10)


119901119903119890119889 (120591119894) = 119905119886119906119895 | 120591119895 isin 119881 and (120591119895 120591119894) isin 119864 (11)




119897119891119905119878 (120591119894) = 119891119905 (120591119894) if 119904119906119888119888 (120591119894) and 119878119906119888119888119875 (120591119894) are empty

min120591119895isin119904119906119888119888(120591119894)or119878119906119888119888119875(120591119894)

119890119904119905119878 (120591119894) otherwise (12)



119904119897119886119888119896119879 (120591119894) = 119897119891119905119878 (120591119894) minus 119890119904119905119878 (120591119894) minus 119908119890119894119892ℎ119905 (120591119894) (13)


119904119897119886119888119896119879 (120591119894) =


(14)



120572 (120591119894) = 119904119897119886119888119896119879 (120591119894)119908119905 (120591119894) (15)




119879119882 (120591119894) (16)


Minimize 119872119904 (119879) (17)




































75 55 36 60 65 50 75

50 45 45 40

200

Weight rarr

Weight rarr

Weightrarr

() rarr

() rarr

() rarr

0 0 0 011 022 05 0

037 078 047 1

0


St(9)St(3)

St(11)

118

7

54 12

1 9

10

6

3

2P1

P2

P3


time

(a 1)

Pk Pl

1k

3k

2l

1 1

2

2

3 3

4

4

Pi Ci


time

(b 1) (b 2) (b 3)

X X X

R

R

R

Pk Pl Pk Pl Pk Pl

1k 1

k

3k

1 1 1 1

2 2

2

3

3

1k

1 1

2

2

3

4

4

Pi Ci Pi Ci Pi Ci

2l


time

(c 1) (c 2)

X X X X

R R

Pk Pl Pk Pl

3k

3

3

1k 1

k

1 11 1

22

4 4

Pi Ci Pi Ci

2l




Subtasks Weight 119897119891119905() 119890119904119905() 119878119905() 1198781199051015840() 120572()1205911 50 0 50 0 184 0371205912 75 0 75 0 0 01205913 45 50 130 35 35 0781205914 55 75 130 0 0 01205915 36 130 166 0 0 01205916 60 166 226 0 68 0111205917 45 166 211 0 212 0471205918 65 226 291 0 141 0221205919 40 211 291 40 40 112059110 200 166 366 0 0 012059111 50 291 366 25 25 0512059112 75 366 441 0 0 0






EdgeCount

Avg RunTime of Task

(Sec)

DeadlinesRange

Epigenomics

100 322

385651 10CP200 644 1000 3228











(18)



ExCostR

ate

14

13

12


MakDelRa

te

13

12

11


(a) (b)


MakDelRa

te 14

12

FAUSIT



(a) MakDelRate

ExCostR

ate

135

13

12

125


FAUSIT


(b) ExCostRate




(19)













6 Conclusion


Data Availability




Acknowledgments


References

































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia





119901119903119890119889 (120591119894) = 119905119886119906119895 | 120591119895 isin 119881 and (120591119894 120591119895) isin 119864 (10)


119901119903119890119889 (120591119894) = 119905119886119906119895 | 120591119895 isin 119881 and (120591119895 120591119894) isin 119864 (11)




119897119891119905119878 (120591119894) = 119891119905 (120591119894) if 119904119906119888119888 (120591119894) and 119878119906119888119888119875 (120591119894) are empty

min120591119895isin119904119906119888119888(120591119894)or119878119906119888119888119875(120591119894)

119890119904119905119878 (120591119894) otherwise (12)



119904119897119886119888119896119879 (120591119894) = 119897119891119905119878 (120591119894) minus 119890119904119905119878 (120591119894) minus 119908119890119894119892ℎ119905 (120591119894) (13)


119904119897119886119888119896119879 (120591119894) =


(14)



120572 (120591119894) = 119904119897119886119888119896119879 (120591119894)119908119905 (120591119894) (15)




119879119882 (120591119894) (16)


Minimize 119872119904 (119879) (17)




































75 55 36 60 65 50 75

50 45 45 40

200

Weight rarr

Weight rarr

Weightrarr

() rarr

() rarr

() rarr

0 0 0 011 022 05 0

037 078 047 1

0


St(9)St(3)

St(11)

118

7

54 12

1 9

10

6

3

2P1

P2

P3


time

(a 1)

Pk Pl

1k

3k

2l

1 1

2

2

3 3

4

4

Pi Ci


time

(b 1) (b 2) (b 3)

X X X

R

R

R

Pk Pl Pk Pl Pk Pl

1k 1

k

3k

1 1 1 1

2 2

2

3

3

1k

1 1

2

2

3

4

4

Pi Ci Pi Ci Pi Ci

2l


time

(c 1) (c 2)

X X X X

R R

Pk Pl Pk Pl

3k

3

3

1k 1

k

1 11 1

22

4 4

Pi Ci Pi Ci

2l




Subtasks Weight 119897119891119905() 119890119904119905() 119878119905() 1198781199051015840() 120572()1205911 50 0 50 0 184 0371205912 75 0 75 0 0 01205913 45 50 130 35 35 0781205914 55 75 130 0 0 01205915 36 130 166 0 0 01205916 60 166 226 0 68 0111205917 45 166 211 0 212 0471205918 65 226 291 0 141 0221205919 40 211 291 40 40 112059110 200 166 366 0 0 012059111 50 291 366 25 25 0512059112 75 366 441 0 0 0






EdgeCount

Avg RunTime of Task

(Sec)

DeadlinesRange

Epigenomics

100 322

385651 10CP200 644 1000 3228











(18)



ExCostR

ate

14

13

12


MakDelRa

te

13

12

11


(a) (b)


MakDelRa

te 14

12

FAUSIT



(a) MakDelRate

ExCostR

ate

135

13

12

125


FAUSIT


(b) ExCostRate




(19)













6 Conclusion


Data Availability




Acknowledgments


References

































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia



































75 55 36 60 65 50 75

50 45 45 40

200

Weight rarr

Weight rarr

Weightrarr

() rarr

() rarr

() rarr

0 0 0 011 022 05 0

037 078 047 1

0


St(9)St(3)

St(11)

118

7

54 12

1 9

10

6

3

2P1

P2

P3


time

(a 1)

Pk Pl

1k

3k

2l

1 1

2

2

3 3

4

4

Pi Ci


time

(b 1) (b 2) (b 3)

X X X

R

R

R

Pk Pl Pk Pl Pk Pl

1k 1

k

3k

1 1 1 1

2 2

2

3

3

1k

1 1

2

2

3

4

4

Pi Ci Pi Ci Pi Ci

2l


time

(c 1) (c 2)

X X X X

R R

Pk Pl Pk Pl

3k

3

3

1k 1

k

1 11 1

22

4 4

Pi Ci Pi Ci

2l




Subtasks Weight 119897119891119905() 119890119904119905() 119878119905() 1198781199051015840() 120572()1205911 50 0 50 0 184 0371205912 75 0 75 0 0 01205913 45 50 130 35 35 0781205914 55 75 130 0 0 01205915 36 130 166 0 0 01205916 60 166 226 0 68 0111205917 45 166 211 0 212 0471205918 65 226 291 0 141 0221205919 40 211 291 40 40 112059110 200 166 366 0 0 012059111 50 291 366 25 25 0512059112 75 366 441 0 0 0






EdgeCount

Avg RunTime of Task

(Sec)

DeadlinesRange

Epigenomics

100 322

385651 10CP200 644 1000 3228











(18)



ExCostR

ate

14

13

12


MakDelRa

te

13

12

11


(a) (b)


MakDelRa

te 14

12

FAUSIT



(a) MakDelRate

ExCostR

ate

135

13

12

125


FAUSIT


(b) ExCostRate




(19)













6 Conclusion


Data Availability




Acknowledgments


References

































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia







EdgeCount

Avg RunTime of Task

(Sec)

DeadlinesRange

Epigenomics

100 322

385651 10CP200 644 1000 3228











(18)



ExCostR

ate

14

13

12


MakDelRa

te

13

12

11


(a) (b)


MakDelRa

te 14

12

FAUSIT



(a) MakDelRate

ExCostR

ate

135

13

12

125


FAUSIT


(b) ExCostRate




(19)













6 Conclusion


Data Availability




Acknowledgments


References

































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia



ExCostR

ate

14

13

12


MakDelRa

te

13

12

11


(a) (b)


MakDelRa

te 14

12

FAUSIT



(a) MakDelRate

ExCostR

ate

135

13

12

125


FAUSIT


(b) ExCostRate




(19)













6 Conclusion


Data Availability




Acknowledgments


References

































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia






6 Conclusion


Data Availability




Acknowledgments


References

































RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia





















RoboticsJournal of




VLSI Design



Shock and Vibration







Journal of



Volume 2018



Volume 2018


Journal of




SensorsJournal of



RotatingMachinery





Propagation






Hindawi


Advances in

Multimedia


a selective mirrored task based fault tolerance mechanism...

Documents