
2168-7161 (c) 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCC.2016.2525987, IEEE Transactions on Cloud Computing


Follow-Me Cloud: When Cloud Services Follow Mobile Users

Tarik Taleb, Adlen Ksentini, and Pantelis A. Frangoudis


Abstract—The trend towards the cloudification of the 3GPP LTE mobile network architecture and the emergence of federated cloud infrastructures call for alternative service delivery strategies for improved user experience and efficient resource utilization. We propose Follow-Me Cloud (FMC), a design tailored to this environment, but with a broader applicability, which allows mobile users to always be connected via the optimal data anchor and mobility gateways, while cloud-based services follow them and are delivered via the optimal service point inside the cloud infrastructure. FMC applies a Markov-Decision-Process-based algorithm for cost-effective, performance-optimized service migration decisions, while two alternative schemes to ensure service continuity and disruption-free operation are proposed, based on either Software Defined Networking technologies or the Locator/Identifier Separation Protocol. Numerical results from our analytic model for FMC, as well as testbed experiments with the two alternative FMC implementations we have developed, demonstrate quantitatively and qualitatively the advantages it can bring about.

1 INTRODUCTION

To cope with the explosive growth in mobile data traffic [1], which challenges both their core and radio networks, mobile operators are pushing towards new architectural solutions to decentralize the user plane of their networks. Such approaches involve moving data anchor gateways towards the edge of the network and carefully serving IP traffic via selected points close to their Radio Access Network (RAN) nodes by mobile data offloading techniques [2]. At the same time, computation offloading over heterogeneous wireless network infrastructures also attracts attention, in view of the availability of cloud computing resources [3].

At the service delivery end, the success of cloud computing has led Content/Service Providers to consider deploying more regional Data Centers (DCs). Furthermore, the dependence of content providers (CPs) and Internet Service Providers (ISPs) on one another for efficient content/service delivery and disruption-free network operation in view of dynamic shifts in traffic demands creates CP-ISP cooperation incentives [4] for joint deployment of network-aware content and service delivery infrastructures, and cloud computing technologies are considered for their implementation.

T. Taleb is with Aalto University, Finland. A. Ksentini and P.A. Frangoudis are with IRISA/Inria Rennes-Bretagne Atlantique.

Importantly, to more efficiently address the needs of mobile users in terms of geographical coverage and proximity of DCs to them, a new means of cooperative service deployment has emerged in the form of a networked federated cloud [5]. This involves allocating virtual resources on a number of DCs dispersed over a specific geographical area, over the infrastructure of potentially heterogeneous federated cloud providers in a transparent manner.

The availability of regional resources and the flexibility of the virtualization technologies upon which federated clouds are built are particularly important to support the modern trend of cloudifying the mobile network infrastructure and offering mobile services in an elastic manner, following user demand and presence. In the context of the 3rd Generation Partnership Project (3GPP) Long Term Evolution/Evolved Packet System (LTE/EPS) [6], a decentralized mobile network architecture would include core network gateways such as Packet Data Network Gateways (PGWs) and Serving Gateways (SGWs) operating as Virtualized Network Function (VNF) instances on the cloud [7], [8], and not necessarily running on top of specialized dedicated hardware.

At the same time, it is important to ensure that users connected to the mobile core network through a 3GPP base station (eNodeB) or using non-3GPP access, such as Wi-Fi, enjoy acceptable Quality of Experience (QoE) by always guaranteeing optimal end-to-end connectivity for the services offered over the federated cloud, during the entire course of service consumption. Here lies the main challenge we address in this work: While user connectivity to the mobile data anchor gateway is always optimal, this is not necessarily the case for the end-to-end mobile service delivery, since a user on the move may keep receiving the service from a distant (suboptimal) DC after moving to different physical locations.

To answer this challenge, we introduced the concept of the Follow-Me Cloud (FMC) [9], a design tailored to an interoperating decentralized mobile network/federated cloud architecture. FMC allows not only the content, but also the service itself, to follow a mobile user while moving, ensuring that the latter is always connected to the optimal data anchor and mobility gateways, at the same time accessing a cloud-based service from the optimal DC, in terms either of geographical/network-topological proximity or any other service- or network-level metric, such as load, service delay, etc.

To realize the FMC vision, service continuity and sophisticated schemes for service migration decisions across DCs are critical. In this article, we present a complete framework defining FMC from architectural, algorithmic and implementation perspectives. We explore alternative schemes for ensuring service continuity, which either build on the Locator-Identifier Separation Protocol (LISP) [10], [11] or on using Software-Defined Networking (SDN) technologies. We further propose a Markov Decision Process (MDP)-based algorithm [12] for optimally performing service migration decisions, taking into account user mobility information, and addressing the tradeoff between migration cost and user experience. Our testbed implementation of the proposed architectural alternatives, coupled with an analytic performance evaluation of our system, serve to demonstrate the feasibility and advantages of FMC, and shed light on the practical aspects of its deployment.

The remainder of this article is structured as follows. In Section 2 we provide an overview of related work. We present the FMC concept, entities and high-level functionality in Section 3. Section 4 introduces an analytic model which captures the tradeoff between the benefit and cost due to service migration, and Section 5 presents an MDP scheme building on this model. We describe a LISP-based and an SDN-based implementation of FMC in Section 6, and present analytic and testbed-based performance results in Section 7, before we conclude the article in Section 8.

2 RELATED WORK

2.1 Service continuity for mobile users

In the mobile networking context in which we position our work, a major and well-studied challenge is maintaining service continuity during user and, importantly, service mobility. An approach to this problem is decoupling session and location identifiers. A protocol which makes this separation explicit is LISP (see Section 2.3), and we apply it in this work.

Nordstrom et al. [13], on the other hand, present Serval, a networking stack which includes a new service access layer to cater for user and service mobility, providing identifier/location separation. It makes use of service identifiers, which would however require modifications to legacy applications to support the proposed functionality.

Other research works have considered the use of OpenFlow to hide, through its rules, any changes to IP addresses. OpenFlow-based solutions often face scalability challenges with respect to the number of flows, number of rules, flow setup rate, bandwidth of the control channel, etc. To reduce the number of control packets, DevoFlow [14] moves some of the flow creation work from controllers to switches. Bifulco et al. [15] propose to distribute control plane functions, in order to enhance system scalability, which is an approach that our SDN-based design (see Section 6.1) could follow.

2.2 Service migration

In a federated cloud context [5], where geographically distributed DCs are connected into a common resource pool, a cloud management procedure for directing service requests to the optimal DCs, satisfying resource, cost, and quality constraints, is necessary. If the respective criteria/constraints are not met, services may need to be migrated across DCs. Malet and Pietzuch [16] propose a cloud management middleware for migrating part of a user's service (represented by a set of virtual machines) between DC sites in response to DC workload variations and in order to move application components geographically closer to users. Agarwal et al. [17] present Volley, an automatic service placement scheme for geographically distributed DCs based on iterative optimization algorithms, which performs service migrations when it detects changes in DC capacity or user location. Alicherry and Lakshman [18] propose a DC selection algorithm for placing a virtual machine (VM), modeling the problem as a sub-graph selection one, while Steiner et al. [19] demonstrate how services can be placed based on information retrieved from an ALTO (Application-Layer Traffic Optimization) server. The above works mainly focus on the VM migration process rather than on VM mobility management.

Other works [20], [21] integrate IP mobility management directly into the hypervisor, which interacts with a VM before and after its migration to update IP addresses in the VM's routing table, or, as in the work of Li et al. [22], invokes Mobile IP functionality each time a VM is created, destroyed or migrated. These solutions perform live VM migration at the expense of potentially long downtimes. Raad et al. [23], on the other hand, achieve sub-second downtimes using a modified version of LISP for rapid traffic redirection. Contrary to our approach (see Section 6.2.2), their scheme also requires modifications to the hypervisor, raising deployment issues.

2.3 The Locator-Identifier Separation Protocol (LISP)

With location and identity traditionally coupled, IP mobility becomes a challenging task. To this end, LISP [10] separates them using Routing Locators (RLOCs) and Endpoint Identifiers (EIDs). LISP does not impose any constraints on the EID and RLOC identifier format; IP addresses are typically used. RLOCs are needed to forward packets to/from the Internet, while EIDs are local to an IP subnet. At the data plane level, LISP maps the EID address to an RLOC, and encapsulates the packets into other IP packets before forwarding them through the IP transit. Usually, a LISP site is managed by at least one tunneling LISP router (xTR), having two functionalities: IP packet encapsulation (packet received by a terminal; ingress functionality, or ITR) and decapsulation (packet received by the network; egress functionality, or ETR).

To guarantee EID reachability, the LISP mapping system includes a Map Resolver (MR), a Map Server (MS), and a cache table at each xTR. When a station has a packet to transmit, the EID of the remote station is used in the destination address. Once reaching the ITR (ingress part of xTR), the latter encapsulates the transmitted packet by adding three headers (LISP, UDP, and IP) and fixing the fields "Source Routing Locator" and "Destination Routing Locator" of the LISP header to the source and destination xTR RLOCs, respectively. The mapping between the EID and the corresponding destination xTR RLOC is first looked up in the local cache. If lookup fails, a Map Request message is sent to the Map Resolver, which responds with a Map Reply if the mapping is found. Otherwise, it redirects this request to the Map Server. The MS searches in its local database to find an xTR that would correspond to this EID, and replies with a Map Reply if it exists. Otherwise, it replies with a Negative Map Reply. Note that the MS receives Map Register messages from ETRs and registers EID-to-RLOC mappings in the mapping database.
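The lookup order described above (local cache, then Map Resolver, then Map Server) can be illustrated with a minimal Python sketch. All class and field names here are hypothetical and chosen for illustration only; the dictionary returned by `encapsulate` stands in for the outer headers and is not the LISP wire format.

```python
class MapServer:
    """Holds EID-to-RLOC mappings registered by ETRs (illustrative only)."""

    def __init__(self):
        self.database = {}

    def map_register(self, eid, rloc):
        # Corresponds to a Map Register message received from an ETR.
        self.database[eid] = rloc

    def lookup(self, eid):
        # Returns the RLOC (Map Reply) or None (Negative Map Reply).
        return self.database.get(eid)


class MapResolver:
    """Answers Map Requests itself when it can, else asks the Map Server."""

    def __init__(self, map_server, known_mappings=None):
        self.map_server = map_server
        self.known_mappings = dict(known_mappings or {})

    def map_request(self, eid):
        if eid in self.known_mappings:
            return self.known_mappings[eid]
        return self.map_server.lookup(eid)


class IngressTunnelRouter:
    """ITR side of an xTR: resolves the destination EID and encapsulates."""

    def __init__(self, rloc, resolver):
        self.rloc = rloc
        self.resolver = resolver
        self.cache = {}

    def encapsulate(self, dst_eid, payload):
        rloc = self.cache.get(dst_eid)          # 1. local cache
        if rloc is None:
            rloc = self.resolver.map_request(dst_eid)  # 2. MR, then MS
            if rloc is None:
                return None                     # Negative Map Reply
            self.cache[dst_eid] = rloc
        return {"outer_src": self.rloc, "outer_dst": rloc,
                "inner_dst": dst_eid, "payload": payload}
```

In this sketch, a second packet towards the same EID is served from the ITR cache without contacting the mapping system, which is the behavior the cache table is meant to provide.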

Compared to Mobile IP, LISP avoids triangular routing thanks to decoupling locations and identifiers. A station can move to another location without changing its EID; only the RLOC has to be updated at the MS/MR. Furthermore, with few modifications, LISP can help achieve short VM migration downtimes [23].

2.4 Our own prior work

This article extends, generalizes and refines our Follow-Me Cloud concept [9], presenting an evolved design targeting generic decentralized mobile network architectures and making heavier use of NFV technologies, bringing the service closer to the end user. We further complement our LISP-based implementation of this scheme [11], which we have updated to match our evolved FMC design, with an SDN-based one. Finally, on the theoretical front, we extend our service migration decision algorithm [12] to also capture 2D mobility scenarios; our algorithm builds on our analytic model presented in [24], included here for completeness.

3 THE FOLLOW-ME CLOUD CONCEPT

3.1 High-level design

In this section, we present the concept and main functionality of our Follow-Me Cloud (FMC) design for optimized, disruption-free cloud-based services for mobile users. Our high-level architecture is shown in Fig. 1. The two main components of our scheme are the FMC controller (FMCC) and the DC/GW mapping entity. These can either be two independent architectural components, two functional entities collocated with existing nodes, or run as software on any DC of the underlying cloud.

FMC was designed with the 3GPP LTE/EPS architecture in mind, but is generic and can be applied to other decentralized mobile network access schemes. We assume that DCs are mapped to a set of data anchor routers/gateways. In an LTE context, these routers are PGWs, while, for users roaming across federated Wi-Fi hotspots, such as the Fon network [25], the data anchor router can be the access router of the ISP to which a public Wi-Fi access point is connected. Depending on the mobile access architecture, other options are possible.

The DC-gateway mapping is based on some criterion, such as location or hop-count distance. This mapping may be static or dynamic. In the latter case, topology information can be exchanged between the FMC service provider and the Mobile Network Operator (MNO). Alternatively, an MNO function could be in charge of updating the FMC service provider with such information either in a reactive or a proactive manner. Note, furthermore, that our design includes a single FMCC for managing distributed DC instances, but this does not preclude a decentralized, self-organized implementation for distributed DC coordination.

Taking advantage of virtualization technologies at the data anchor end, our design extends our first version of the FMC architecture [9] with the capability to serve users directly from the data anchor router, bringing services closer to the user end. We distinguish between two types of DCs. At a macroscopic level, there are the core (macro) DCs, which can be considered as data origin servers. At a microscopic level, caches implemented at the data anchor gateways operate as micro-DCs to serve mobile users more efficiently. The focus of the macro-DC level is persistent service (VM) storage and service instantiation. The FMC functionality is implemented both at UEs and at the micro-DC level and is responsible for service identification and migration procedures. As shown in Fig. 1, a two-level mapping takes place: Macro-DCs are mapped to a group of micro-DCs, each of which is in turn mapped to one or more data anchor routers. Note that these data anchors can be implemented as VNFs hosted in the federated cloud (e.g., collocated with their corresponding micro-DC).
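As a rough illustration of the two-level mapping, the Python sketch below stores the macro-DC to micro-DC and micro-DC to data anchor associations in two dictionaries and resolves the serving micro-DC and its macro-DC for a given anchor. The identifiers (`macro-dc-1`, `pgw-1`, and so on) and function names are hypothetical; how the mapping entity actually stores this information is not specified here.

```python
# Hypothetical identifiers; only the two-level structure comes from the text.
MACRO_TO_MICRO = {
    "macro-dc-1": ["micro-dc-a", "micro-dc-b"],
}
MICRO_TO_ANCHORS = {
    "micro-dc-a": ["pgw-1", "pgw-2"],
    "micro-dc-b": ["pgw-3"],
}


def micro_dc_for_anchor(anchor):
    """Return the micro-DC mapped to a data anchor router, or None."""
    for micro_dc, anchors in MICRO_TO_ANCHORS.items():
        if anchor in anchors:
            return micro_dc
    return None


def macro_dc_for_micro(micro_dc):
    """Return the macro-DC whose group contains this micro-DC, or None."""
    for macro_dc, micro_dcs in MACRO_TO_MICRO.items():
        if micro_dc in micro_dcs:
            return macro_dc
    return None


def service_path(anchor):
    """Resolve (micro-DC, macro-DC) for the anchor a UE is attached to."""
    micro_dc = micro_dc_for_anchor(anchor)
    if micro_dc is None:
        return None
    return micro_dc, macro_dc_for_micro(micro_dc)
```

A dynamic mapping, as mentioned earlier in this section, would correspond to updating these two dictionaries as topology information is exchanged with the MNO.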

Our design allows for various strategies for selecting which VMs to cache at micro-DCs, as well as for deciding whether a service component will be migrated or replicated at another data anchor following user mobility. However, such strategies are outside the scope of this work; for simplicity and presentation clarity, we assume that a service is deployed at the data anchor router to which the user is attached upon service initiation, and is migrated (i.e., no replication takes place) as a user moves.

From this point on, unless otherwise noted, the term DC will refer to a micro-DC (µ-DC).

3.2 Service migration process

With the IP address change which takes place when a UE changes its data anchor router (e.g., PGW relocation in a 3GPP LTE/EPS mobile network architecture), there is a potential need for an FMC service migration. This change can be detected by the serving micro-DC. Whether service migration is worthwhile depends on the service type and requirements (e.g., an ongoing video service with strict QoS requirements may be migrated; a delay-sensitive measurement task for an emergency warning Machine Type Communications service must always be migrated to the optimal DC), content size (e.g., the movie a user is watching is about to finish at the time of PGW relocation; the UE FMC application layer decides not to initiate service migration), and/or user class. The migration decision relies on several potentially conflicting criteria related to user expectations about the service (QoS/QoE, cost) and network/cloud provider policies (load balancing, maximizing DC resource utilization, micro-DC capacity, etc.).

Fig. 1. The FMC high-level architecture in a federated cloud and distributed mobile network environment.

Once the UE or the current micro-DC considers it appropriate to migrate the service, the FMC plugin available at the micro-DC may request the FMCC to select the optimal micro-DC to which the service should be migrated. As a service may consist of multiple cooperating components, potentially residing at different locations, a decision has to be made on whether the service is to be fully or partially migrated, while considering the service migration cost, e.g., the cost associated with the initiation/replication of a new VM at the target micro-DC, with the release of resources at the source DC, or with the bandwidth consumption due to traffic being exchanged between the DCs and/or the FMCC. An estimate of these costs shall be compared to the benefits for the (federated) cloud in terms of traffic distribution, but also to those for end users in terms of QoE.

4 AN ANALYTIC MODEL FOR FMC

In this section we propose an analytic model to establish the relationship and the relevant tradeoff between FMC service migration cost and benefits in terms of user experience. This model provides insights upon which we base a Markov Decision Process algorithm to derive optimal migration policies (see Section 5).

Fig. 2. A typical 3GPP cellular network: (a) 1D (linear) model, with cells C0 to C4; (b) 2D model, with hexagonal cells C(i,j), where i is the ring index (up to ring 4) and j enumerates the 6i cells of ring i.

4.1 Markov-based system model

We use Markovian models to represent our system, aiming to be able to derive the UE position with respect to the serving DC and predict system evolution. Here, we focus our discussion on a 3GPP LTE mobile network environment. A 3GPP network is typically divided into hexagonal cells (Fig. 2). For the sake of simplicity, we assume that micro-DCs and data anchor routers (PGWs) are collocated with eNodeBs. In a real implementation, a micro-DC could be mapped to a set of PGWs, which are in turn mapped to a pool of eNodeBs.

We consider a random walk mobility model, where a UE can visit any of the six neighboring cells with probability p = 1/6. The residence time of a UE in each cell follows an exponential distribution with mean 1/µ. Fig. 2 shows a service area with k = 5 rings of cells. Service migration and data anchor gateway relocation are triggered for a UE when its location is k hops away from the serving DC (assumed to be collocated with eNodeBs). Let X(t) denote the distance of the UE to the serving DC (in number of hops) at the time instant t. The system {X(t), t ≥ 0} forms a Continuous-Time Markov Chain (CTMC), with the state space {C(i,j) | 0 ≤ i ≤ (k − 1), 1 ≤ j ≤ 6i}.
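As a sanity check on this mobility model, the sketch below counts, for every cell of a given ring, how many of its six neighbors lie one ring closer to the serving DC, in the same ring, and one ring farther away. It assumes a cube-coordinate representation of the hexagonal grid (cells are triples (x, y, z) with x + y + z = 0, and the ring index is the largest absolute coordinate); this representation and the function names are illustrative choices, not part of the model itself.

```python
from collections import Counter

# The six neighbor offsets of a hexagonal cell in cube coordinates.
DIRECTIONS = [(1, -1, 0), (1, 0, -1), (0, 1, -1),
              (-1, 1, 0), (-1, 0, 1), (0, -1, 1)]


def ring(cell):
    """Hop distance of a cell from the central cell (0, 0, 0)."""
    return max(abs(c) for c in cell)


def cells_in_ring(i):
    """All cells at hop distance exactly i from the center."""
    return [(x, y, -x - y)
            for x in range(-i, i + 1)
            for y in range(-i, i + 1)
            if ring((x, y, -x - y)) == i]


def neighbor_profile(cell):
    """(inner, same, outer) neighbor counts of a cell, relative to its ring."""
    r = ring(cell)
    deltas = Counter(
        ring(tuple(a + b for a, b in zip(cell, d))) - r for d in DIRECTIONS
    )
    return (deltas[-1], deltas[0], deltas[1])


def ring_profiles(i):
    """How many cells of ring i share each (inner, same, outer) profile."""
    return Counter(neighbor_profile(c) for c in cells_in_ring(i))
```

With p = 1/6 per neighbor, a profile (1, 2, 3) translates to probabilities p, 2p and 3p of moving inward, staying in the ring, and moving outward. Cells of the same ring that share a profile behave identically with respect to ring transitions, which is the property that makes state aggregation possible.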

This chain faces a state space explosion problem, especially if k is high. To address this problem, we reduce the state space by aggregating states that show the same behavior [26], [27], [24], to obtain a new chain A(t) with a lower number of states. In Fig. 2, we see that UEs in ring 0 can move to any neighboring cell with the same probability. UEs in ring 1 come back to the cell which hosts the serving DC with probability p, stay in the same ring (same distance from the serving DC) with probability 2p, and move to ring 2, increasing the distance from the DC, with probability 3p. Consequently, all states of ring 1 can be aggregated into one state. Regarding ring 2, there are two groups of cells: (i) cells neighboring three ring-3, two ring-2, and one ring-1 cells, and (ii) cells having two neighbors at each of the 3 rings. Depending on the ring-2 cell in which the UE is located, it therefore moves to ring 3 with probability either 3p or 2p. Therefore, we obtain two aggregated states: State $C_2$ aggregates states $\{C_{2,1}, C_{2,3}, C_{2,5}, C_{2,7}, C_{2,9}, C_{2,11}\}$ and state $C_2^{(1)}$ aggregates states $\{C_{2,2}, C_{2,4}, C_{2,6}, C_{2,8}, C_{2,10}, C_{2,12}\}$. The same rationale is applied to obtain aggregated states $C_i$ and $C_i^{(m)}$ for any ring i, where $1 \le m \le \lceil (i-1)/2 \rceil$ is the identifier of an aggregated state within a ring.

Fig. 3. Markov chain for k = 5. States are labeled 0, 1, 2, 3, 4, 2(1), 3(1), 4(1) and 4(2); transition rates are t1 = 6pµ, t2 = pµ, t3 = 2pµ, t4 = 3pµ, t5 = pµ.

As proven by Langar et al. [26], the new aggregated chain A(t), derived from the initial Markov chain X(t), is also Markovian. Fig. 3 shows the transition diagram of the aggregated Markov chain when the service migration is triggered when the UE is k = 5 hops away from the serving DC. Based on this figure, we can derive the steady state probabilities of the aggregated states. For simplicity, each aggregated state of ring i in Fig. 3 is labeled using the ring number and the superscript used to identify different aggregate states of the same ring, if necessary. The balance equations (Eq. (1)-(6)) to solve the system follow (footnote 1):

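\[
\begin{aligned}
\pi_0 &= \tfrac{1}{6}\pi_1 + \tfrac{1}{2}\pi_{k-1} + \tfrac{1}{3}\sum_{j=1}^{\lceil (k-2)/2 \rceil} \pi^{(j)}_{k-1} \\
\pi_1 &= \pi_0 + \tfrac{1}{3}\pi_1 + \tfrac{1}{6}\pi_2 + \tfrac{1}{3}\pi^{(1)}_2 \\
\pi_2 &= \tfrac{1}{6}\pi_1 + \tfrac{1}{6}\pi_3 + \tfrac{1}{3}\pi^{(1)}_2 + \tfrac{1}{6}\pi^{(1)}_3 \\
\pi_{k-1} &= \tfrac{1}{6}\pi_{k-2} + \tfrac{1}{6}\pi^{(1)}_{k-1} \\
\pi_i &= \tfrac{1}{6}\pi_{i-1} + \tfrac{1}{6}\pi_{i+1} + \tfrac{1}{6}\pi^{(1)}_{i-1} + \tfrac{1}{6}\pi^{(1)}_{i+1}, \quad \forall\, 3 \le i \le k-2
\end{aligned}
\tag{1}
\]

\[
\begin{aligned}
\pi^{(1)}_2 &= \tfrac{1}{3}\pi_1 + \tfrac{1}{3}\pi_2 + \tfrac{1}{6}\pi^{(1)}_3 \\
\pi^{(1)}_3 &= \tfrac{1}{3}\pi_2 + \tfrac{1}{3}\pi_3 + \tfrac{1}{3}\pi^{(1)}_2 + \tfrac{1}{6}\pi^{(1)}_3 + \tfrac{1}{6}\pi^{(1)}_4 + \tfrac{1}{3}\pi^{(2)}_4 \\
\pi^{(1)}_4 &= \tfrac{1}{3}\pi_3 + \tfrac{1}{3}\pi_4 + \tfrac{1}{6}\pi^{(1)}_3 + \tfrac{1}{6}\pi^{(1)}_5 + \tfrac{1}{3}\pi^{(2)}_4 + \tfrac{1}{6}\pi^{(2)}_5 \\
\pi^{(1)}_{k-1} &= \tfrac{1}{3}\pi_{k-2} + \tfrac{1}{3}\pi_{k-1} + \tfrac{1}{6}\pi^{(1)}_{k-2} + \tfrac{1}{6}\pi^{(2)}_{k-1} \\
\pi^{(1)}_i &= \tfrac{1}{3}\pi_{i-1} + \tfrac{1}{3}\pi_i + \tfrac{1}{6}\pi^{(1)}_{i-1} + \tfrac{1}{6}\pi^{(1)}_{i+1} + \tfrac{1}{6}\pi^{(2)}_i + \tfrac{1}{6}\pi^{(2)}_{i+1}, \quad \forall\, 5 \le i \le k-2
\end{aligned}
\tag{2}
\]

\[
\begin{aligned}
\pi^{(j)}_i = {} & \tfrac{1}{6}\pi^{(j-1)}_i + \tfrac{b_1}{6}\pi^{(j+1)}_i + \tfrac{1}{6}\pi^{(j-1)}_{i-1} + \tfrac{1}{6}\pi^{(j)}_{i-1} + \tfrac{b_2}{6}\pi^{(j)}_{i+1} + \tfrac{b_2}{6}\pi^{(j+1)}_{i+1}, \\
& \forall\, 6 < i < k-1 \ \text{and} \ 2 \le j \le \left\lceil \tfrac{i-1}{2} \right\rceil - 1
\end{aligned}
\tag{3}
\]

where

\[
b_1 =
\begin{cases}
1 & \text{if } i \text{ is odd} \\
1 & \text{if } i \text{ is even and } 2 \le j \le \left\lceil \tfrac{i-1}{2} \right\rceil - 2 \\
2 & \text{if } i \text{ is even and } j = \left\lceil \tfrac{i-1}{2} \right\rceil - 1
\end{cases}
\qquad \text{and} \qquad
b_2 =
\begin{cases}
0 & \text{if } 6 \le i \le k-2 \\
1 & \text{if } i = k-1
\end{cases}
\]

\[
\pi^{(l)}_{2l} = \tfrac{1}{6}\pi^{(l-1)}_{2l} + \tfrac{1}{6}\pi^{(l-1)}_{2l-1} + \tfrac{c_1}{6}\pi^{(l)}_{2l+1}, \quad \forall\, 2 \le l \le \left\lceil \tfrac{k-1}{2} \right\rceil
\tag{4}
\]

where

\[
c_1 =
\begin{cases}
0 & \text{if } l = \tfrac{k-1}{2} \\
1 & \text{otherwise}
\end{cases}
\]

\[
\begin{aligned}
\pi^{(l)}_{2l+1} = {} & \tfrac{1}{6}\pi^{(l-1)}_{2l+1} + \tfrac{1}{6}\pi^{(l)}_{2l+1} + \tfrac{1}{6}\pi^{(l-1)}_{2l} + \tfrac{1}{6}\pi^{(l)}_{2l} + \tfrac{c_2}{6}\pi^{(l)}_{2l+2} + \tfrac{c_2}{6}\pi^{(l+1)}_{2l+2}, \\
& \forall\, 2 \le l \le \tfrac{k-2}{2}
\end{aligned}
\tag{5}
\]

where

\[
c_2 =
\begin{cases}
0 & \text{if } l = \tfrac{k-2}{2} \\
1 & \text{otherwise}
\end{cases}
\]

\[
\sum_{i=0}^{k-1} \pi_i + \sum_{i=2}^{k-1} \sum_{m=1}^{\lceil (i-1)/2 \rceil} \pi^{(m)}_i = 1
\tag{6}
\]

Footnote 1. For details on the state aggregation algorithm and more insight on the derivation of the balance equations for the reduced CTMC, the reader is referred to the work of Langar et al. [26].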

4.2 Average UE-DC distance and the probability to be connected to the optimal DC

Let E[Dist] denote the average distance of a UE from the serving DC. E[Dist] depends on the value of k, and the distance (number of hops) of the UE from the data anchor router collocated with the serving DC. Recall that a UE remains connected to this anchor and all data are consequently routed through the latter until service migration is triggered. Therefore, the average distance is expressed as:

\[
E[Dist] = \sum_{i=1}^{k-1} i\,\pi_i + \sum_{i=2}^{k-1} \sum_{j=1}^{\lceil (k-2)/2 \rceil} i\,\pi^{(j)}_i .
\tag{7}
\]

The probability that the UE is connected to the optimal DC during the system's lifetime is π0.
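For small k, π0 and E[Dist] can also be estimated numerically without state aggregation. The sketch below is one such cross-check under stated assumptions: it enumerates the individual cells of rings 0 to k − 1 in cube coordinates (an illustrative representation), treats any move into ring k as a migration that returns the UE to distance 0, and runs power iteration on the resulting jump chain. Because the mean residence time 1/µ is the same in every cell, the stationary distribution of the jump chain coincides with that of the CTMC. Function names and the iteration count are hypothetical choices.

```python
DIRECTIONS = [(1, -1, 0), (1, 0, -1), (0, 1, -1),
              (-1, 1, 0), (-1, 0, 1), (0, -1, 1)]
ORIGIN = (0, 0, 0)


def ring(cell):
    """Hop distance of a cell (cube coordinates) from the serving DC."""
    return max(abs(c) for c in cell)


def cells_within(k):
    """All cells in rings 0 .. k-1."""
    return [(x, y, -x - y)
            for x in range(-k + 1, k)
            for y in range(-k + 1, k)
            if ring((x, y, -x - y)) < k]


def ring_distribution(k, iterations=2000):
    """Stationary probability mass per ring, by power iteration."""
    cells = cells_within(k)
    prob = dict.fromkeys(cells, 0.0)
    prob[ORIGIN] = 1.0
    for _ in range(iterations):
        nxt = dict.fromkeys(cells, 0.0)
        for cell, mass in prob.items():
            for d in DIRECTIONS:
                target = tuple(a + b for a, b in zip(cell, d))
                if ring(target) >= k:
                    # Reaching ring k triggers migration: distance resets to 0.
                    target = ORIGIN
                nxt[target] += mass / 6.0
        prob = nxt
    rings = [0.0] * k
    for cell, mass in prob.items():
        rings[ring(cell)] += mass
    return rings


def expected_distance(k):
    """E[Dist]: ring index weighted by the stationary mass of each ring."""
    return sum(i * mass for i, mass in enumerate(ring_distribution(k)))
```

The first entry of `ring_distribution(k)` is the estimate of π0. For k = 2 the chain has only two rings, and the values can be verified by hand: π0 = 0.4 and E[Dist] = 0.6.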

4.3 Average end-to-end delay from the serving DC

We define the end-to-end (e2e) delay as the delay for a UE to receive data packets from the serving DC. Similar to E[Dist], the e2e delay depends on the UE distance (number of hops) to the data anchor router connecting to the DC. The average e2e delay is denoted by E[D] and is given by:

\[
E[D] = \sum_{i=1}^{k-1} D_i\,\pi_i + \sum_{i=2}^{k-1} \sum_{j=1}^{\lceil (k-2)/2 \rceil} D_i\,\pi^{(j)}_i ,
\tag{8}
\]

where $D_i$ is the e2e delay when the UE is at distance i (cells belonging to ring i).




4.4 Service migration cost

MC denotes the cost of migrating part of the service (i.e., some of the components/sessions composing it) or all of it from a DC to the optimal one. It depends on the size of the objects to be migrated, as well as the amount of signaling messages exchanged among the FMCC, the UE and the DCs. In FMC, there are three signaling messages to trigger service migration. The cost for a service migration thus follows:

\[
Cost = Objects_{size} + 3\,SIG_{size},
\tag{9}
\]

where $SIG_{size}$ is the signaling message size. Hence, MC can be derived as follows:

\[
MC = \left[ 3p\,\pi_{k-1} + 2p \left( \sum_{j=1}^{\lceil (k-2)/2 \rceil} \pi^{(j)}_{k-1} \right) \right] \times Cost.
\tag{10}
\]
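Eq. (9) and (10) translate directly into a few lines of Python. The sketch below assumes that the steady-state probabilities of the outermost ring (π_{k−1} and the list of π^{(j)}_{k−1}) have already been obtained from the balance equations; the function and parameter names are hypothetical, and the numbers in the usage note are arbitrary illustration values, not results.

```python
def migration_event_cost(objects_size, sig_size):
    """Eq. (9): size of migrated objects plus three signaling messages."""
    return objects_size + 3 * sig_size


def migration_cost(p, pi_outer, pi_outer_aggregated, objects_size, sig_size):
    """Eq. (10): weight the per-migration cost by the probability of
    leaving the outermost ring k-1.

    pi_outer            -- steady-state probability pi_{k-1}
    pi_outer_aggregated -- iterable of pi^{(j)}_{k-1} for the other
                           aggregated states of ring k-1
    """
    trigger_weight = 3 * p * pi_outer + 2 * p * sum(pi_outer_aggregated)
    return trigger_weight * migration_event_cost(objects_size, sig_size)
```

For instance, with p = 1/6, π_{k−1} = 0.3, aggregated-state probabilities 0.2 and 0.1, an object size of 1000 and a signaling message size of 10 (all in the same unit), the weight is 0.25 and the call returns 257.5.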

4.5 Service migration duration

The service migration duration is the time required to transfer part or all of the service from the current DC to the optimal one. It mainly depends on: (i) the size of the objects to transfer; (ii) the RTT of the TCP connection between the two DCs; and (iii) the time needed to convert a VM to the appropriate format, if the two DCs are not using the same hypervisor. It also represents the time when the service cannot be used, in other words, service disruption time (denoted as SDT). Assuming that the data transfer is based on an FTP-like application, we use the empirical TCP latency model of Sikdar et al. [28], and the SDT value can be computed as follows:

\[
SDT = \left[ \log_{1.57} N + f(p_{loss}, RTT)\,N + 4\,p_{loss} \log_{1.57} N + 20\,p_{loss} + \frac{(10 + 3\,RTT)}{4\,(1 - p_{loss})\,W_{max}\sqrt{W_{max}}}\,N \right] RTT + T_{VM\ conversion}
\tag{11}
\]

where $p_{loss}$ denotes the packet loss rate, N is the number of packets to transfer, $W_{max}$ is the maximum size of the congestion window, $T_{VM\ conversion}$ is the time required to convert a VM and

\[
f(p_{loss}, RTT) = \frac{2.32\,(2\,p_{loss} + 4\,p_{loss}^2 + 16\,p_{loss}^3)}{(1 + RTT)^3}\,N + \frac{(1 + p_{loss})}{RTT \cdot 10^3}.
\]

Note that $N = \lceil Service_{size} / MSS \rceil$, where MSS is the maximum segment size used by the TCP connection.
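Eq. (11) can be evaluated with a short Python function. This is a sketch that transcribes the formula as written above; the parameter names are hypothetical, RTT and the VM conversion time are assumed to be in seconds, and the service size and MSS are assumed to be in the same unit (e.g., bytes).

```python
import math


def service_disruption_time(service_size, mss, rtt, p_loss, w_max,
                            t_vm_conversion=0.0):
    """Evaluate Eq. (11) for a transfer of service_size over one TCP flow."""
    n = math.ceil(service_size / mss)       # number of packets to transfer
    log_n = math.log(n, 1.57)               # logarithm to base 1.57

    # f(p_loss, RTT) as defined after Eq. (11).
    f = (2.32 * (2 * p_loss + 4 * p_loss ** 2 + 16 * p_loss ** 3)
         / (1 + rtt) ** 3) * n + (1 + p_loss) / (rtt * 1e3)

    window_term = (10 + 3 * rtt) / (
        4 * (1 - p_loss) * w_max * math.sqrt(w_max)) * n

    bracket = (log_n + f * n + 4 * p_loss * log_n
               + 20 * p_loss + window_term)
    return bracket * rtt + t_vm_conversion
```

When both DCs use the same hypervisor, `t_vm_conversion` can be left at its default of zero, so that the result reflects only the TCP transfer latency.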

5 AN MDP-BASED SCHEME FOR SERVICE MIGRATION

We model the service migration decision as a Markov Decision Process (MDP), capturing the tradeoff between reducing cost and maintaining satisfactory user experience. This model decides whether a service consumed by a user at distance d from the current DC should be migrated to an optimal DC, a decision process carried out by the FMCC. To formulate the service migration decision policy, we define a Continuous Time Markov Decision Process (CTMDP) that associates to each state an action, corresponding transition probabilities, and rewards.

Let $s_t$ be the process describing the evolution of the system state and S denote the state space. We assume the cellular network topology model of Fig. 2. Each cell belongs to a Tracking Area (TA) and each TA belongs to a Service Area (SA), which is served by one anchor gateway (PGW or access router). Fig. 4 shows the CTMDP for the case of a one-dimensional (1D) mobility model: A mobile user has only two possible destinations, i.e., a new SA with probability p, or moving back to a visited SA with probability 1 − p. Higher values of p indicate that a user is moving far from the current DC. Fig. 5 illustrates the case for the 2D mobility model described in the previous section. The vector A = (a1, a2) describes the actions available to the FMCC at each epoch (i.e., when a UE performs handoff and enters into another SA). Action a2 is used if the service is migrated to an optimal DC, while action a1 is used if the UE is still served by the same DC. Depending on the current state, the set of available actions differs. For the sake of simplicity, we demonstrate the use of MDP to solve the service migration problem for 1D mobility. The same reasoning can be applied to the case for the 2D model shown in Fig. 5. Note that, despite its simplicity, the 1D model is appropriate for vehicular networking environments where users move along predefined trajectories, such as highways or railway tracks.

In the 1D mobility model, the residence time of a user in each SA follows an exponential distribution with mean 1/µ. Hence, the state space S is defined as S = {0, 1, . . . , thr}, where thr represents the maximum distance (in terms of visited SAs) from where the service must be migrated to the optimal DC.

Fig. 4. CTMDP of the service migration procedure: 1D mobility model. States are 0, 1, 2, 3, ..., thr-1, thr; transitions are labeled with a rate and an action: (µ, a1), (pµ, a1), ((1-p)µ, a1) and (µ, a2).

The FMCC observes the current state s of the network and associates a set of possible actions $A_s$ to it, taken upon arrival to it from the previous state. For a given action a, an instantaneous reward r(s, s′, a) is associated to a transition from state s to another state s′. The corresponding formal representation of the CTMDP is as follows:

\[
\left( S,\ (A_s,\ s \in S),\ q(s'|s, a),\ r(s, s', a) \right).
\]

Fig. 5. CTMDP of the service migration procedure: 2D mobility model. States are labeled 0, 1, 2, 3, 4, 2(1), 3(1), 4(1) and 4(2); transitions are t1 = (6pµ, a1), t2 = (pµ, a1), t3 = (2pµ, a1), t4 = (3pµ, a1), t5 = (pµ, a1), t6 = (µ, a2).

For particular states, the set of possible actions A reduces to a subset $A_s$. A policy P associates an action a(s|P) to a state s. Let Q be the transition matrix, with q(s′|s, a) denoting the transition rate between states s and s′ in S due to action a, which, in the FMC scenario, represents a UE moving from one SA to another SA. By construction, we define a policy as a function of the current state. The decision to migrate a service or not is taken by observing only the current state. Since this process is Markovian (the sojourn time in a SA follows an exponential distribution), the controlled process is also Markovian. To resolve the MDP, we use an equivalent Discrete Time Markov Decision Process (DTMDP) for the defined CTMDP, with a finite state space S. We argue that the state space is finite, since, after a certain distance (thr) from the current DC, the service is automatically migrated to the optimal DC. For each s ∈ S, $A_s$ represents the finite set of allowed actions in that state. This DTMDP can be derived by uniformization and discretization of the initial process as follows: When all transition rates in matrix Q are bounded, the sojourn times in all states are exponential with bounded parameters tr(s|s, a). Therefore, $\sup_{(s \in S,\, a \in A_s)} tr(s|s, a)$ exists and there is a constant value c such that

\[
\sup_{(s \in S,\ a \in A_s)} \left[ 1 - p(s|s, a) \right] tr(s|s, a) \le c < \infty,
\]

where p(s|s, a) denotes the probability of staying in the same state after the next event. We can now define an equivalent uniformized process with state-independent exponential sojourn times with parameter c, and transition probabilities:

p(s′|s, a) =

{1− ([1−p(s|s)]tr(s|s,a))

c s = s′

p(s′|s)tr(s′|s,a)c s 6= s′

.
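The uniformization step can be illustrated numerically. The following is a minimal sketch for a single fixed action, assuming (for simplicity, and unlike the general notation above) that the sojourn-time parameter depends only on the current state; the function name `uniformize` and the three-state example are hypothetical and not part of the FMC implementation.

```python
import numpy as np


def uniformize(jump_probs, rates, c=None):
    """Return the transition matrix of the uniformized chain.

    jump_probs[s, s2]: probability of moving from s to s2 at the next event
                       (the diagonal holds the self-transition probabilities).
    rates[s]:          parameter of the exponential sojourn time in state s
                       (assumption: it does not depend on the target state).
    c:                 uniformization constant; defaults to the largest value
                       of (1 - jump_probs[s, s]) * rates[s].
    """
    jump_probs = np.asarray(jump_probs, dtype=float)
    rates = np.asarray(rates, dtype=float)
    exit_rates = (1.0 - np.diag(jump_probs)) * rates
    if c is None:
        c = exit_rates.max()
    if c <= 0 or np.any(exit_rates > c + 1e-12):
        raise ValueError("c must be positive and bound every exit rate")
    # Off-diagonal entries: p(s'|s) * tr(s) / c
    P = jump_probs * rates[:, None] / c
    # Diagonal entries: 1 - [1 - p(s|s)] * tr(s) / c
    P[np.diag_indices_from(P)] = 1.0 - exit_rates / c
    return P


# Hypothetical three-state example: the middle state is left twice as fast.
example = uniformize(
    jump_probs=[[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]],
    rates=[2.0, 4.0, 2.0],
)
```

States with a slower exit rate than c receive a self-loop in the uniformized chain, which is what makes the sojourn times state-independent.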

By setting c = µ, the transition probabilities of the DTMDP procedure are defined as follows:

$$p(j|s,a) = \begin{cases}
1, & j = 0,\ s \neq 0,\ s \neq \mathit{thr},\ a = a_1 \ \text{ or } \ j = 1,\ s = 0,\ a = a_2 \\
p, & j = s+1,\ s \neq 0,\ s \neq \mathit{thr},\ a = a_1 \ \text{ or } \ j = 0,\ s = \mathit{thr},\ a = a_1 \\
1-p, & j = s-1,\ s \neq 0,\ a = a_1 \\
0, & \text{otherwise.}
\end{cases}$$

It is important to note that when the system is in state s = thr, the only available action is a1. If the UE moves to another SA, where the distance exceeds the threshold thr, service migration is automatically triggered.

In the remainder of this section, we use the DTMDP version. For t ∈ N, let $s_t$, $a_t$ and $r_t$ denote the state, action and reward at time t of the DTMDP procedure, respectively. Let $P^a_{(s,s')} = p[s_{t+1} = s' \mid s_t = s,\ a_t = a]$ denote the transition probabilities and $R^a_{(s,s')} = E[r_{t+1} \mid s_t = s,\ s_{t+1} = s',\ a_t = a]$ denote the expected reward associated with the transitions. A policy π is a mapping between a state and an action, and can be denoted as $a_t = \pi(s_t)$, where t ∈ N. Accordingly, a policy π = (θ1, θ2, θ3, . . . , θN) is a sequence of decision rules to be used at all decision epochs. We restrict ourselves only to deterministic policies, as they are simple to implement [29]. When a UE hands off from one SA to another SA, the FMCC has to decide either to migrate the service using action a2 or not to migrate it using action a1. For each transition, a reward is obtained. This reward is a function of the cost of migrating a service (zero in case of no migration) and the quality obtained from the new state. The cost of migrating a service is defined as follows:

$$g(a) = \begin{cases} 0, & a = a_1 \\ C_m, & a = a_2 \end{cases}$$

where $C_m$ denotes the cost of migrating all or part of the service. Therefore, the reward function is given by

$$r(s,s',a) = Q(s') - g(a),$$

where Q(s) quantifies the quality perceived by a user connected to the source DC when at state s. Note that quality is inversely proportional to the hop distance from the source DC, and Q(0) is the maximum quality that a UE can enjoy, when connected to the optimal DC. In general, the quality function can be expressed in the form Q(s) = Q(0) − K, where K denotes a predetermined factor. Given a discount factor γ ∈ [0, 1] and an initial state s, we define the total discounted reward of a policy π = (θ1, θ2, θ3, . . . , θN) as

$$v^{\pi}_{\gamma} = \lim_{N \to \infty} E^{\pi}_{\gamma}\left\{ \sum_{t=1}^{N} \gamma^{t-1} r_t \right\} = E^{\pi}_{\gamma}\left\{ \sum_{t=1}^{\infty} \gamma^{t-1} r_t \right\}.$$

Given the uniformization of the CTMDP, r(s, s′, a) explicitly depends on the transitions between states. The new reward function r′(s, s′, a) is obtained as follows [29]:

$$r'(s,s',a) = r(s,s',a)\, \frac{\alpha + \beta(s,s',a)}{\alpha + c},$$

where β(s, s′, a) is the transition rate between states s and s′ when using action a, and α is a predetermined parameter. With the new formulation of the reward function and the uniformization of the CTMDP, we can use discounted models, as in the discrete-time case, to solve the system [29]. Let $v(s) = \max_{\pi \in \Pi} v^{\pi}(s)$ denote the maximum discounted total reward, given the initial state s. From [29], the optimality equations are given by

$$v(s) = \max_{\pi \in \Pi} \left\{ r'(s,s',a) + \sum_{s' \in S} \gamma\, P[s'|s,a]\, v(s') \right\}.$$

Fig. 6. OpenFlow-based Follow-Me Cloud architecture.

The solutions of the optimality equations correspond to the maximum expected discounted total reward v(s) and the optimal policy π∗(s). It is worth mentioning that the optimal policy π∗(s) indicates the decision as to which network the UE is to be attached to and which DC it is to be connected to, given the state s. There are several algorithms that can be used to solve the optimization problem given by the above optimality equations. Value iteration and policy iteration are two notable examples [29]. In our case, we have used the former.
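To make the use of value iteration concrete, the following is a minimal sketch on a simplified one-dimensional model; it is not the Matlab implementation used for the results in this article. Several choices are assumptions made for illustration only: the state is the hop distance from the serving DC, action a1 lets the UE move away with probability p or back with probability 1 − p, action a2 returns the system to state 0, the quality decays linearly as Q(s) = Q(0) − K·s, and the reward is applied per transition without the uniformization correction. All function and parameter names are hypothetical.

```python
import numpy as np

A1, A2 = 0, 1  # a1: do not migrate, a2: migrate


def build_model(thr=10, p=0.5, q0=10.0, k=1.0, cm=1.0):
    """Build the toy model; state s is the hop distance from the serving DC."""
    n = thr + 1
    P = np.zeros((2, n, n))
    # a1: from the optimal DC the UE can only move one hop away.
    P[A1, 0, 1] = 1.0
    for s in range(1, thr):
        P[A1, s, s + 1] = p
        P[A1, s, s - 1] = 1.0 - p
    # At s = thr, moving further away triggers automatic migration (state 0).
    P[A1, thr, 0] += p
    P[A1, thr, thr - 1] += 1.0 - p
    # a2: the service is migrated to the optimal DC.
    for s in range(1, thr):
        P[A2, s, 0] = 1.0
    allowed = np.ones((2, n), dtype=bool)
    allowed[A2, 0] = False    # nothing to migrate in state 0
    allowed[A2, thr] = False  # only a1 is available at s = thr
    quality = q0 - k * np.arange(n)   # assumed linear quality decay
    cost = np.array([0.0, cm])        # g(a1) = 0, g(a2) = Cm
    R = quality[None, None, :] - cost[:, None, None]  # r(s, s', a) = Q(s') - g(a)
    return P, R, allowed


def value_iteration(P, R, allowed, gamma=0.9, tol=1e-9, max_iter=10000):
    """Iterate the Bellman optimality operator until the values stabilize."""
    v = np.zeros(P.shape[1])
    for _ in range(max_iter):
        # Expected one-step reward plus discounted value, per (action, state).
        q = (P * (R + gamma * v[None, None, :])).sum(axis=2)
        q = np.where(allowed, q, -np.inf)
        v_new = q.max(axis=0)
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    return v, q.argmax(axis=0)


P, R, allowed = build_model(thr=10, p=0.5, cm=1.0)
values, policy = value_iteration(P, R, allowed)
# policy[s] == A2 means "migrate when at distance s".
```

Sweeping p and the migration cost in such a model yields one policy row per parameter setting, which is the kind of output a policy table summarizes.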

6 IMPLEMENTATION ALTERNATIVES

To show the breadth of available options to realize the FMC architecture, we explore alternative enabling technologies for its implementation. We have identified two candidate approaches, one based on SDN and one on the LISP protocol. Note, however, that, as we have shown [30], the FMC design can also be integrated with the Mobile IPv6 protocol.

6.1 An OpenFlow-based FMC architecture

6.1.1 Design

Our SDN-based scheme is built on the NOX OpenFlow Controller [31] and is shown in Fig. 6. Mobile users access cloud-based services over a (mobile/wireless) client network and their traffic is forwarded by OpenFlow-capable access routers (OF AR x). Traffic from/to each federated cloud location (data center), inside which VM instances interconnected via virtual switches are managed by local hypervisors, is routed through OpenFlow micro-Data Center routers (OF µ-DC x). At the core of this architecture is our NOX-based FMCC, with which the components of our architecture communicate. The controller is assumed to be aware of i) the virtual switch instances and their data path identifiers on the physical OpenFlow switch, ii) the VM identifiers (namely the IP and MAC addresses), iii) the location and IP addresses of each default gateway in the topology, iv) the OpenFlow switch port identifiers at which the DC, router and client networks are connected, v) the IP address ranges managed by each DHCP server both for client and DC networks, and vi) the locations of distributed data centers that can either be part of the operator network or could be autonomous domains. The basic functionalities of the SDN-based FMCC follow.

Fig. 7. Service migration procedure in an SDN-based FMC implementation.

Location Management: For correctly installing forwarding rules into the OpenFlow switch, each client and VM are linked to home locations, based on their IP address allocation and gateway settings. The current location of a client in the client network is also maintained. If any traffic from a particular client or VM appears on a different network than the one corresponding to its home location, the FMCC updates the status of that entity to indicate that it is in a Visited Network/Location.

The location management process is also responsible for selecting the appropriate micro-DC location for the VM serving a client, utilizing information about the geographic location of DCs and characteristics of the client-VM/DC network paths, such as average delay and network load or congestion.
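The home/visited bookkeeping described above can be sketched as follows. This is an illustrative simplification rather than the NOX-based controller code: the class name, network names and address prefixes are hypothetical, and the sketch assumes non-overlapping prefixes so that an IP address identifies its home network unambiguously.

```python
import ipaddress


class LocationRegistry:
    """Tracks the home network of each entity and where it was last seen."""

    def __init__(self, networks):
        # networks: mapping from a network name to its CIDR prefix.
        self.networks = {
            name: ipaddress.ip_network(cidr) for name, cidr in networks.items()
        }
        self.status = {}

    def home_of(self, ip):
        """Return the name of the network whose prefix contains ip, if any."""
        addr = ipaddress.ip_address(ip)
        for name, net in self.networks.items():
            if addr in net:
                return name
        return None

    def observe(self, ip, seen_on):
        """Record traffic from ip on network seen_on and return its status."""
        home = self.home_of(ip)
        state = "HOME" if home == seen_on else "VISITED"
        self.status[ip] = (home, seen_on, state)
        return state


registry = LocationRegistry(
    {"client-net-1": "10.0.1.0/24", "client-net-2": "10.0.2.0/24"}
)
first = registry.observe("10.0.1.7", "client-net-1")   # traffic at home
second = registry.observe("10.0.1.7", "client-net-2")  # traffic after a move
```

In a deployment, the `observe` step would be driven by the controller's packet events rather than called directly.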

Mobility detection and service migration: The actual VM migration is carried out by the cloud infrastructure software. (See Section 6.1.2 for some details on our proof-of-concept implementation.) The sequence of messages and events is shown in Fig. 7.

When a user changes its point of attachment, the OpenFlow Access Router on the visited network sends a PacketIn OpenFlow message to the controller, which is thus notified of the mobility event. This process can be initiated when a user performs the DHCP message exchange with the visited network. Then, the FMCC executes the service migration decision algorithm (see Section 5), and, if migration is necessary, it installs the appropriate OF rule at the OF AR of the visited network and launches the VM migration, also adding a rule to the OpenFlow router serving the micro-DC which hosts the migrated VM, so that when traffic for the latter reaches the OpenFlow switch of a visited network, it can be matched in the flow table of that switch.

Session management: A key functional requirement is for the controller to preserve all ongoing user sessions while a VM is migrated. This implies that no configuration change (e.g., IP address and gateway configurations) is allowed on the VM. Furthermore, the (private) IP address ranges managed by DCs can be overlapping, and the first-hop setting may not be consistent across subnet boundaries. We address this by setting up a tunnel within the visited network segment. The tunnel operates by rewriting the IP address field within the packet IP header for each outgoing packet from the VM to the external network. The original IP header is restored in the packet when the last hop in the visited network segment is reached. The same technique is applied for all the incoming traffic to the VM. This is achieved by modifying the set of OpenFlow rules installed in the visited network.
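The rewrite-and-restore behavior of this tunnel can be simulated in a few lines. In the testbed the rewriting is performed by OpenFlow rules; the sketch below only models its effect on packet headers, and the `Packet` and `RewriteTunnel` names, as well as the addresses, are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: bytes = b""


class RewriteTunnel:
    """Swaps the VM address for a transit address inside the visited segment."""

    def __init__(self, vm_ip, transit_ip):
        self.vm_ip = vm_ip
        self.transit_ip = transit_ip

    def enter(self, pkt):
        """First hop of the segment: hide the VM address."""
        return self._swap(pkt, self.vm_ip, self.transit_ip)

    def leave(self, pkt):
        """Last hop of the segment: restore the original VM address."""
        return self._swap(pkt, self.transit_ip, self.vm_ip)

    @staticmethod
    def _swap(pkt, old, new):
        src = new if pkt.src == old else pkt.src
        dst = new if pkt.dst == old else pkt.dst
        return replace(pkt, src=src, dst=dst)


tunnel = RewriteTunnel(vm_ip="10.0.0.5", transit_ip="172.16.0.5")
outgoing = Packet(src="10.0.0.5", dst="192.0.2.9")
in_transit = tunnel.enter(outgoing)   # source rewritten inside the segment
restored = tunnel.leave(in_transit)   # identical to the original packet
```

Because the same swap handles both the source and destination fields, incoming traffic to the VM follows the same two steps, and the VM itself never observes a configuration change.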

6.1.2 Experimental setup

We have developed an experimental testbed implementing the SDN-based architecture of Fig. 6. Our setup includes two DCs, each one implemented as a single VMware ESXi hypervisor. Each ESXi host is equipped with two 1 Gbps Ethernet cards for forwarding management and OpenFlow traffic over the network. A virtual network topology is defined inside the ESXi host by two vSwitches (soft switches), where each physical NIC is connected with one soft switch instance. The ESXi host manages Windows XP VMs and each VM is configured with two virtual NICs (vNIC) connected with the virtual network through the soft switches. One vNIC carries management traffic (vmKernel) and the other carries OpenFlow traffic. The storage space is shared between the two DCs and is accessed by the standard iSCSI protocol. The DCs are remotely managed by the VMware vCenter software.

There are two separate WLANs for client connectivity and two Linux-based hosts act as first-hop routers for client traffic, assigning client addresses using DHCP and running Linux traffic control (tc) for controlling path characteristics (e.g., delay and available bandwidth, and thus congestion) between the two network segments. In this simplified setup, each DC host plays the role of a micro-DC mapped to a data anchor router (in our case the WLAN first-hop router), and is placed in each of the two client networks (and served by the respective router).

The NOX-based FMCC, as specified in Section 6.1, the Linux-based routers, ESXi hosts and the VMware vCenter host are all connected to ports of an NEC IP-8800 OpenFlow switch. From the physical OpenFlow switch perspective, four virtual switches (VLANs) are used for separately carrying the traffic of the two data centers and the client network. The FMCC manages the forwarding behavior on the four VLANs and also monitors the path characteristics between a DC and the client network for resource management optimizations. For triggering live VM migration across DCs, the VMotion cloud infrastructure technology and proprietary API from VMware are used. VMotion traffic is mapped to the management network, while all active communication between the VM and remote users is managed by the OpenFlow network.

Fig. 8. A LISP-based Follow-Me Cloud architecture.

6.2 A LISP-based FMC architecture

6.2.1 Design

Our second architectural alternative is based on the LISP protocol. Each micro-DC is connected to the Internet through an xTR router. The client (mobile) network domain contains IP subnets interconnected through xTR routers as well, and the architecture includes a LISP MR/MS element. All LISP entities (MR/MS and xTRs) are implemented as VNFs and are deployed on VMs in the cloud. Note that, in practice, the xTR routers of the mobile transport network domain could either be VNFs or be built directly on top of the data anchor router hardware. This FMC architecture also includes an FMCC in the form of a VNF.

Fig. 8 shows the envisioned LISP-based FMC architecture and scenario. A mobile user accessing a service hosted at micro-DC1, and initially connected to Subnet 1, moves to Subnet 2. The xTR router of Subnet 2 notifies the MR/MS about this mobility event. The MR/MS updates its cache and informs the FMCC about the new location of the user. The FMCC executes the service migration decision algorithm (see Section 5) to decide whether to migrate the user's service (that is, the VM(s) hosting the service) to the micro-DC corresponding to the new location of the user. If the decision is positive, the FMCC requests Hypervisor 1 to launch the migration procedure. Since the VM is migrated to Hypervisor 2, the xTR router of Subnet 4 is informed of the migration event, and the latter further informs the MR and the xTR router of Subnet 3 (as well as other xTR routers involved in communication with the VM) accordingly. Finally, the MR/MS resolver updates its cache and notifies the FMCC about this change.

6.2.2 Support for service continuity

To ensure service continuity, a LISP-assisted live service migration mechanism should be capable of (i) maintaining the VM Endpoint identifier (EID) when migrating it from its current DC to the target one, (ii) updating the Routing Locator (RLOC) of the target xTR router to include the VM's EID, (iii) informing the MR server and all xTR routers involved in the communication with the migrated VM to update the RLOC of the migrated VM, and (iv) informing the old xTR router to erase the VM EID from its cache.

In this work, we assume that a VM's EID is the first IP address it obtains, and that its RLOC is the IP address of the corresponding xTR router. Furthermore, we consider that a VM's EID is registered at the initial xTR router with a large IP subnet prefix (in our case, /24). The EID is mapped to the RLOC of the source xTR router at the MR, as well as in the caches of xTR routers communicating with the VM.

When the VM is migrated to the target hypervisor, the EID of the migrated VM should be maintained and the destination xTR router has to be informed about the migration event. Different approaches exist to achieve this. In one approach [32], the xTR router does not become aware of the new VM until the latter initiates communication, when the xTR detects that the source IP address (the migrated VM's EID) does not belong to its IP subnet. Although this solution does not require any signaling messages, it can break the current VM's connections and thus does not ensure service continuity; if the VM has no packet to transmit, the xTR routers currently communicating with the VM may continue using its old Routing Locator (RLOC). An alternative to this approach was proposed by Raad et al. [23], where LISP is used in the control plane to inform the source and target xTR routers about a successful VM migration. A new Change Priority (CP) LISP message is introduced, which allows (i) the target hypervisor to inform the target xTR router about the migration of a new VM to the target DC (including the VM's EID), and (ii) to update the cache (RLOC-EID mapping) of other xTR routers. However, this solution requires modifying the hypervisor, making it hard to implement and deploy, as the hypervisor software is independent from the operator (as well as the LISP domain).

In this article, we propose another approach (Fig. 9), where the FMCC informs both xTR routers (handling the involved DC domains) about the change in the VM's RLOC. Since the FMC controller is integrated in the LISP domain (it already communicates with MR/MS to track user locations), it could easily be made aware of the xTR router handling a DC domain. This could be achieved by querying the MR/MS for the xTR router handling the DC's IP domain.

Fig. 9. Service migration procedure in a LISP-based FMC implementation.

Once the xTR router becomes aware of the migration of a new VM to its local DC, it notifies the MR/MS to update its RLOC by including the migrated VM's EID. The EID of the migrated VM is in the form of the initial IP address, but with a /32 prefix. Therefore, the RLOC of the target xTR router is mapped to both its subnet prefix and the VM's EID prefix (/32). Furthermore, the former xTR router erases the old EID-to-RLOC entry from its cache. To speed up traffic redirection, the source xTR router uses a new LISP message (as in [23]) to inform the other xTR routers that were communicating with the concerned VM, so that they update the VM's RLOC accordingly. Note that the xTR router should maintain a list of xTR routers involved with each active connection.
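The effect of registering the migrated VM's EID with a /32 prefix can be illustrated with a longest-prefix-match lookup. The sketch below is a toy mapping cache, not the Click-based implementation of our xTR or MR/MS elements; the `MapCache` class is hypothetical and the addresses are taken from documentation ranges.

```python
import ipaddress


class MapCache:
    """Toy EID-to-RLOC mapping cache with longest-prefix-match lookups."""

    def __init__(self):
        self.entries = {}  # EID prefix (ip_network) -> RLOC (string)

    def register(self, eid_prefix, rloc):
        self.entries[ipaddress.ip_network(eid_prefix)] = rloc

    def unregister(self, eid_prefix):
        self.entries.pop(ipaddress.ip_network(eid_prefix), None)

    def lookup(self, eid):
        """Return the RLOC of the most specific prefix containing eid."""
        addr = ipaddress.ip_address(eid)
        matches = [net for net in self.entries if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.entries[best]


cache = MapCache()
# Before migration: the whole /24 of the source DC maps to the source xTR.
cache.register("192.0.2.0/24", "198.51.100.1")
before = cache.lookup("192.0.2.10")
# After migration: the VM keeps its EID, now registered as a /32 at the target xTR.
cache.register("192.0.2.10/32", "203.0.113.1")
after = cache.lookup("192.0.2.10")
neighbour = cache.lookup("192.0.2.11")  # other VMs still resolve to the source xTR
```

The more specific /32 entry takes precedence over the /24 entry, so only traffic for the migrated VM is redirected to the target xTR router.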

6.2.3 Experimental setup

The VNFs corresponding to the entities of our LISP-based FMC scheme are implemented using the Click modular router framework [33] and run as ClickOS [34] VMs deployed on a Xen hypervisor. DCs are emulated using kvm [35] to take advantage of the VM migration options that it offers, as kvm can migrate only the RAM content between the involved DCs, while the hypervisors have a shared storage via NFS. We emulate a mobile user moving between two IP domains; whenever the user enters a new domain, service (VM) migration is triggered. Each VM runs Ubuntu 8.04, with 1 GB of RAM and a one-core processor. DCs (kvm managers) run Ubuntu 14.04 with 8 GB of RAM. We emulated varying latencies in the paths between DCs and between the FMCC and xTR routers using Linux tc.

7 PERFORMANCE EVALUATION

In this section we present a quantitative performance evaluation of the proposed scheme. We begin with numerical results from our analytic model, followed by measurements carried out on our testbed implementation of the SDN- and LISP-based FMC solutions.

7.1 Model-based performance results

We present numerical results obtained by resolving the Markov model defined in Section 4.1. We evaluate the performance of FMC in terms of the probability that the UE is connected to the optimal DC, the average distance of the UE from the optimal DC, the UE connection latency, the service migration cost and the service disruption time during service migration.

We assume a reliable connection between DCs (zero packet loss), the total size of the service to migrate (i.e., the respective VM) is set to 1 Gb, and all DCs are assumed to use the same hypervisor; thus, there is no VM conversion cost when migrating a service. Note that the case for k = 7 corresponds to a situation where the FMC concept is not used, as k is unrealistically high; in practice, the service area is typically limited to a modest value for k.



Fig. 10. Probability to be connected to the optimal DC and the average distance from it. [Plots: (a) π0, probability of being connected to the optimal DC vs. k; (b) UE average distance (hops) vs. k.]

Fig. 11. Average latency. [Plot: average end-to-end delay (s) vs. k.]

Fig. 10(a) and 10(b) show the probability of a UE to be connected to an optimal DC, and the average distance from it for different values of k, respectively. We notice that this probability is a decreasing function of k: High probability is obtained when the service migration is triggered after each UE handover, ensuring that the UE is always connected to the optimal DC, while the opposite effect appears when delaying service migration to longer distances. On the other hand, the average distance is an increasing function of k. Indeed, if service migration is delayed, the UE is likely connected from a distance higher than one hop. This average distance exceeds two hops when k is higher than 6.

In Fig. 11, we plot the average latency of the UE connection for different values of k. Note that latency is given by $Lat_i = 0.02\, i^2$ (s), where i is the hop distance from the serving DC. Intuitively, the average latency increases with k. If we compare the cases of using k = 2 and k = 7, we see a significant difference of about 200 ms in delays.
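As a small worked example of the latency model, the sketch below evaluates the per-hop latency and averages it over a distribution of hop distances. The quadratic latency function is the one stated above; the distance distribution used in the example is made up for illustration and is not a result of the Markov model.

```python
def latency_s(hops):
    """Latency in seconds at a given hop distance from the serving DC."""
    return 0.02 * hops ** 2


def average_latency_s(distance_probs):
    """Expected latency for a mapping {hop distance: probability}."""
    return sum(prob * latency_s(hops) for hops, prob in distance_probs.items())


one_hop = latency_s(1)     # 0.02 s
three_hops = latency_s(3)  # 0.18 s
# Hypothetical distribution: the UE is at the optimal DC half of the time.
example_avg = average_latency_s({0: 0.5, 1: 0.3, 2: 0.2})
```

Because latency grows with the square of the distance, even a small probability mass at large distances dominates the average, which is why delaying migration (larger k) raises the average latency noticeably.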

Fig. 12 shows the service migration cost for different values of k. We present results for migrating all, 50%, and 10% of the service. This cost is a decreasing function of k, and it is high when service relocation is launched at each UE handover. Furthermore, the highest cost is reached when migrating all the service, as it critically depends on the object size to migrate.

Fig. 13 depicts the service disruption time (SDT) for different values of k. We assume that RTT is proportional to the square of the hop distance i between two DCs, given by $RTT_i = 0.01\, i^2$ (s). Similar to Fig. 12, we considered three service migration cases: migrating all, 50%, or 10% of the service. We clearly observe that the SDT value is an increasing function of k. This is attributable to the fact that high values of k mean longer distances between the concerned DCs, thus an increased value of RTT. In addition, we notice that the SDT value is the highest when migrating all the service, mainly due to the larger size of the objects to transfer. Notably, this difference is not significant for low values of k. Since SDT is highly dependent on the RTT, accelerating data transfers using tools like FDT [36] is mandatory when the RTT is high.

Fig. 12. Cost of service migration. [Plot: cost (Gb) vs. k, for migrating all, 50%, and 10% of the service.]

Fig. 13. Service disruption time – RTT proportional to the square of the distance. [Plots: (a) RTT (s) vs. k; (b) SDT (s) vs. k, for migrating all, 50%, and 10% of the service.]

7.2 Service migration policies

Given the tradeoff between migration cost and user experience improvement identified in Section 7.1, we demonstrate the construction of service migration policies obtained using a Matlab implementation [37] of the Value Iteration algorithm [29]. We set thr = 10, i.e., a service migration is automatically triggered if the UE is at a distance higher than 10 (in terms of the number of visited SAs) from the source DC. The factor K of the quality function is arbitrarily set to 1. We introduce a new metric τ representing the ratio between the cost and the maximum quality Q(0). Two scenarios are studied:

• τ = 0.1, which represents a low cost compared to the quality obtained if a service migration is launched. This could be the case when only a part of a service is migrated.

• τ = 0.5, which indicates that the cost is not negligible compared to the quality obtained if a service migration is launched.

Fig. 14(a) and 14(b) illustrate the optimal policy constructions for the aforementioned scenarios. The intersection between i and j represents the action (a=C continuation, a=M migration of the service) to be taken by the FMC controller, where i is the distance from the serving DC and j is the probability p. We remark that the optimal policy construction has a threshold-based form, since beyond a certain distance value, the recommended action is service migration. This distance is inversely proportional to the probability p.

(a) τ = 0.1

p\d   1  2  3  4  5  6  7  8  9  10
0.1   C  C  C  C  C  C  C  C  C  M
0.2   C  C  C  C  C  C  C  C  M  M
0.3   C  C  C  C  C  C  C  M  M  M
0.4   C  C  C  C  C  C  M  M  M  M
0.5   C  C  C  C  C  M  M  M  M  M
0.6   C  C  C  C  C  M  M  M  M  M
0.7   C  C  C  C  M  M  M  M  M  M
0.8   C  C  C  C  M  M  M  M  M  M
0.9   C  C  C  C  M  M  M  M  M  M
1     C  C  C  C  M  M  M  M  M  M

(b) τ = 0.5

p\d   1  2  3  4  5  6  7  8  9  10
0.1   C  C  C  C  C  C  C  C  C  C
0.2   C  C  C  C  C  C  C  C  C  M
0.3   C  C  C  C  C  C  C  C  M  M
0.4   C  C  C  C  C  C  M  M  M  M
0.5   C  C  C  C  C  C  M  M  M  M
0.6   C  C  C  C  C  M  M  M  M  M
0.7   C  C  C  C  C  M  M  M  M  M
0.8   C  C  C  C  C  M  M  M  M  M
0.9   C  C  C  C  C  M  M  M  M  M
1     C  C  C  C  M  M  M  M  M  M

Fig. 14. Optimal policy construction.
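The threshold structure can be read off the policy tables of Fig. 14 programmatically. The sketch below encodes each table row as a string of C/M actions for d = 1..10 (transcribed from Fig. 14) and extracts the smallest distance at which migration is recommended; the helper and variable names are hypothetical.

```python
# Rows of Fig. 14, one string per probability p; position d-1 holds the action at distance d.
POLICY_TAU_01 = {
    0.1: "CCCCCCCCCM", 0.2: "CCCCCCCCMM", 0.3: "CCCCCCCMMM", 0.4: "CCCCCCMMMM",
    0.5: "CCCCCMMMMM", 0.6: "CCCCCMMMMM", 0.7: "CCCCMMMMMM", 0.8: "CCCCMMMMMM",
    0.9: "CCCCMMMMMM", 1.0: "CCCCMMMMMM",
}
POLICY_TAU_05 = {
    0.1: "CCCCCCCCCC", 0.2: "CCCCCCCCCM", 0.3: "CCCCCCCCMM", 0.4: "CCCCCCMMMM",
    0.5: "CCCCCCMMMM", 0.6: "CCCCCMMMMM", 0.7: "CCCCCMMMMM", 0.8: "CCCCCMMMMM",
    0.9: "CCCCCMMMMM", 1.0: "CCCCMMMMMM",
}


def migration_threshold(row):
    """Smallest distance d at which the row recommends migration, or None."""
    index = row.find("M")
    return index + 1 if index >= 0 else None


def thresholds(policy):
    """Map each probability p to its migration threshold."""
    return {p: migration_threshold(row) for p, row in policy.items()}


low_cost = thresholds(POLICY_TAU_01)
high_cost = thresholds(POLICY_TAU_05)
```

A value of None means that, within the tabulated distances, the policy never recommends an explicit migration and relies on the automatic migration beyond thr.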

For instance, when p = 0.8, the proposed policy recommends service migration from d = 5 and from d = 6 onwards, for the first and second scenarios, respectively, since a high p value means that the UE is moving far from the current DC, while low values indicate that the UE will most likely remain in the vicinity of the current DC, or will come back to the service area of the current DC. Furthermore, we remark that τ has an impact on the optimal policy construction, since there are fewer service migrations when τ is high. This is intuitive, as the incurred cost is not negligible compared to the achieved gain when migrating a service. In case of a random walk (p = 0.5), for both scenarios, the optimal policy recommends service migration when the distance exceeds 5, which represents a good tradeoff between cost and quality.

7.3 Testbed experiments

7.3.1 OpenFlow-based implementation

We present early results obtained from experiments on our SDN-based FMC testbed described in Section 6.1.2. In particular, we focus on the evolution of service latency during and after the migration process, to demonstrate the advantages of FMC. To emulate increased service latency due to a user being served from a suboptimal DC, we introduced a 50 ms delay (round-trip) in the path between the client network and the suboptimal DC using Linux tc, while the RTT to the optimal DC was 1 ms. In our experiment, the client was originally being served from the suboptimal DC and we monitored the RTT from the user terminal to the VM hosting the service using ping. The red curve in Fig. 15 corresponds to a case when FMC is not used and the client keeps being served suboptimally, and thus the approximately 50 ms RTT. The green curve represents the case when FMC is active. When service migration is triggered, after a period of instability and a short-term increase in latency, the latter converges to a value close to 1 ms, after migration has been completed.

Note that the increased delay at the start of the experiment is mainly a result of the installation of OpenFlow rules when new traffic arrives at the FMCC. This implies a scalability issue for large centralized FMC deployments. Therefore, decentralization schemes, such as the one proposed by Bifulco et al. [15], could be considered to address it.

Fig. 15. Client ping latency with and without FMC.

7.3.2 LISP-based implementation

For our LISP-based FMC design, we experimented with the testbed described in Section 6.2.3. Our metrics of interest are (i) the service downtime duration, i.e., the time when the VM is not available, (ii) the RTT between the mobile user and the remote VM, and (iii) the migration duration.

Fig. 16. The RTT between the migrated VM and mobile user. [Plot: RTT (ms) vs. time (s), with the start and end of migration marked.]

Fig. 16 shows the instantaneous RTT between the migrated VM and the mobile user, measured using ping. Using Linux tc, we introduce a 10 ms RTT between the DCs. The RTT between the FMCC and xTR1 is approximately 1 ms, while the RTT between the FMCC and xTR2 is set to 10 ms. At the start of the experiment, the user is connected to DC1, and enjoys good service quality (RTT ≈ 12 ms). From t = 55 s, the user moves to another network and the measured RTT increases to approximately 250 ms. The FMCC thus triggers service migration to DC2, which is considered the optimal one. Migration starts at t = 58 s and ends at t = 84 s. During this period, the overall service downtime is 7.5 ms (not visible in the figure due to its short duration). This downtime is mainly due to the fact that the FMCC does not notify the involved xTRs about the change in the VM's EID until migration is completed. The kvm hypervisor at DC1 keeps the VM active until migration is complete and the VM keeps sending back ICMP echo replies from its original location (DC1). After migration is complete, the user is served by DC2 (optimal), which brings RTT down to a much lower value.

Fig. 17. Downtime duration as a function of path latencies. Each point is the mean value of a few tens of iterations. [Plot: downtime (ms) vs. RTT between the FMCC and the DC (ms), for Case 1 and Case 2.]

We further quantify the dependence of the service downtime duration on the latencies in the paths between the entities of our architecture. In Fig. 17, we show the downtime duration when the VM is migrated from DC1 to DC2 for different values of the RTT between the FMCC and DC2 for the following testbed configurations: (i) The RTTs between the FMCC and xTR1 and xTR2 are set to 100 ms and 10 ms, respectively (case 1). (ii) The RTTs between the FMCC and xTR1 and xTR2 are both set to 50 ms (case 2).

It appears that the downtime duration is proportional to the RTT between the FMCC and the target DC, since the longer it takes to have the information from the DC about the success of a VM migration, the longer it takes to accordingly inform the xTR routers and thus redirect traffic to xTR2. We draw the same conclusion for the impact of the RTT between the FMCC and the xTR routers on downtime. Service downtime is mainly rooted in the LISP mobility management process, and its capacity to rapidly inform the xTR routers about the migration event. The VM size does not have a strong impact on downtime, since kvm activates the VM in DC2 only after migration is complete, which confirms the observations made by Raad et al. [23]. Importantly, the maximum downtime experienced is 100 ms (case 2), which is not expected to have a noticeable impact on service quality, especially for non-interactive applications.

Another conclusion from our experiments is that the duration of the service migration itself becomes practically independent of the RTT between DCs, as the latter increases. The time it takes to migrate a VM from DC1 to DC2 does not change much as the RTT in the DC1-DC2 path increases beyond 10 ms. This is because VM migration is based on TCP, which is impacted more by the link bandwidth (set to 100 Mbps in this experiment) than by the RTT.

8 CONCLUSION

We presented our vision towards the enhanced delivery of cloud-based services to mobile users, tackling mobility-related challenges and offering an optimized user experience. We have designed Follow-Me Cloud, a framework that enables cloud services to “follow” users on the move, by performing sophisticated decisions to migrate service resources to the appropriate cloud infrastructure locations. We have defined an analytic model for the behavior of our system, and built on it to propose a Markov-Decision-Process-based algorithm for service migration decisions, which captures the tradeoff between migration cost and user experience. We have further developed two alternative FMC architecture designs, one based on SDN technologies and one making use of the LISP protocol. The presented numerical results from our model, as well as experiments with our SDN- and LISP-based testbeds, demonstrate the potential of our approach for optimized mobile cloud-based service delivery and its feasibility for real-world deployment.

APPENDIX
LIST OF ACRONYMS

CTMC    Continuous-Time Markov Chain
CTMDP   Continuous-Time Markov Decision Process
DTMDP   Discrete-Time Markov Decision Process
EID     Endpoint Identifier
ETR     Egress xTR
FMC     Follow-Me Cloud
FMCC    FMC Controller
ITR     Ingress xTR
MDP     Markov Decision Process
MR      Map Resolver
MS      Map Server
OF AR   OpenFlow Access Router
PGW     Packet Data Network Gateway
RLOC    Routing Locator
SGW     Serving Gateway
UE      User Equipment
xTR     Tunneling Router

REFERENCES[1] Cisco, “Cisco Visual Networking Index: Forecast and

Methodology, 2013-2018,” White Paper, June 2014. [Online].Available: http://goo.gl/xoBrTA

[2] T. Taleb, K. Samdanis, and A. Ksentini, “Supporting highly mobileusers in cost-effective decentralized mobile operator networks,”IEEE Trans. Veh. Technol., vol. 63, no. 7, pp. 3381–3396, Sep. 2014.

[3] L. Lei, Z. Zhong, K. Zheng, J. Chen, and H. Meng, “Challenges onwireless heterogeneous networks for mobile cloud computing,”IEEE Wireless Communications, vol. 20, no. 3, pp. 34–44, June 2013.



14

[4] B. Frank, I. Poese, G. Smaragdakis, A. Feldmann, B. Maggs,S. Uhlig, V. Aggarwal, and F. Schneider, “Collaboration Opportu-nities for Content Delivery and Network Infrastructures,” ACMSIGCOMM ebook on Recent Advances in Networking, vol. 1, Aug.2013.

[5] R. Moreno-Vozmediano, R. Montero, and I. Llorente, “IaaS cloudarchitecture: From virtualized datacenters to federated cloud in-frastructures,” IEEE Computer, vol. 45, no. 12, pp. 65–72, Dec. 2012.

[6] 3GPP, “TS 23.002, Network Architecture,” Dec. 2014, release 13.[7] H. Wang, S. Chen, H. Xu, M. Ai, and Y. Shi, “SoftNet: A software

defined decentralized mobile network architecture toward 5G,”Network, IEEE, vol. 29, no. 2, pp. 16–22, Mar. 2015.

[8] T. Taleb, M. Corici, C. Parada, A. Jamakovic, S. Ruffino, G. Kara-giannis, and T. Magedanz, “EASE: EPC as a service to ease mobilecore network deployment over cloud,” Network, IEEE, vol. 29,no. 2, pp. 78–88, Mar. 2015.

[9] T. Taleb and A. Ksentini, “Follow me cloud: interworking fed-erated clouds and distributed mobile networks,” IEEE Network,vol. 27, no. 5, pp. 12–19, 2013.

[10] D. Farinacci, V. Fuller, D. Meyer, and D. Lewis, “Locator/idseparation protocol (lisp),” Working Draft, IETF Secretariat,Internet-Draft draft-ietf-lisp-13, Jun. 2011, http://www.ietf.org/internet-drafts/draft-ietf-lisp-13.txt. [Online]. Available: http://www.ietf.org/internet-drafts/draft-ietf-lisp-13.txt

[11] A. Ksentini, T. Taleb, and F. Messaoudi, “A LISP-Based Implementation of Follow Me Cloud,” IEEE Access, vol. 2, pp. 1340–1347, 2014.

[12] A. Ksentini, T. Taleb, and M. Chen, “A markov decision process-based service migration procedure for follow me cloud,” in Proc. IEEE ICC, 2014, pp. 1350–1354.

[13] E. Nordstrom, D. Shue, P. Gopalan, R. Kiefer, M. Arye, S. Y. Ko, J. Rexford, and M. J. Freedman, “Serval: An end-host stack for service-centric networking,” in Proc. USENIX NSDI, 2012.

[14] A. R. Curtis, J. C. Mogul, J. Tourrilhes, P. Yalagandula, P. Sharma, and S. Banerjee, “Devoflow: Scaling flow management for high-performance networks,” in Proc. ACM SIGCOMM, 2011.

[15] R. Bifulco, M. Brunner, R. Canonico, P. Hasselmeyer, and F. Mir, “Scalability of a mobile cloud management system,” in Proc. ACM SIGCOMM Workshop on Mobile Cloud Computing (MCC ’12), 2012.

[16] B. Malet and P. Pietzuch, “Resource allocation across multiple cloud data centres,” in Proc. 8th ACM Int’l Workshop on Middleware for Grids, Clouds and e-Science (MGC ’10), 2010, pp. 5:1–5:6.

[17] S. Agarwal, J. Dunagan, N. Jain, S. Saroiu, A. Wolman, and H. Bhogan, “Volley: Automated data placement for geo-distributed cloud services,” in Proc. USENIX NSDI, 2010.

[18] M. Alicherry and T. V. Lakshman, “Network aware resource allocation in distributed clouds,” in Proc. IEEE INFOCOM, 2012.

[19] M. Steiner, B. G. Gaglianello, V. Gurbani, V. Hilt, W. Roome, M. Scharf, and T. Voith, “Network-aware service placement in a distributed cloud environment,” in Proc. ACM SIGCOMM (demo), 2012.

[20] H. Watanabe, T. Ohigashi, T. Kondo, K. Nishimura, and R. Aibara, “A performance improvement method for the global live migration of virtual machine with ip mobility,” in Proc. 5th Int’l Conference on Mobile Computing and Ubiquitous Networking (ICMU ’10), 2010.

[21] E. Harney, S. Goasguen, J. Martin, M. Murphy, and M. Westall, “The efficacy of live virtual machine migrations over the internet,” in Proc. 2nd Int’l Workshop on Virtualization Technology in Distributed Computing (VTDC ’07), 2007.

[22] Q. Li, J. Huai, J. Li, T. Wo, and M. Wen, “HyperMIP: Hypervisor controlled mobile ip for virtual machine live migration across networks,” in Proc. 11th IEEE High Assurance Systems Engineering Symposium (HASE 2008), Dec. 2008.

[23] P. Raad, S. Secci, D. P. Chi, A. Cianfrani, P. Gallard, and G. Pujolle, “Achieving sub-second downtimes in large-scale virtual machine migrations with LISP,” IEEE Trans. Netw. Service Manag., vol. 11, no. 2, pp. 133–143, 2014.

[24] T. Taleb and A. Ksentini, “An analytical model for follow me cloud,” in Proc. IEEE Globecom, Dec. 2013.

[25] “Fon,” https://corp.fon.com/.

[26] R. Langar, N. Bouabdallah, and R. Boutaba, “A comprehensive analysis of mobility management in mpls-based wireless access networks,” IEEE/ACM Trans. Netw., vol. 16, no. 4, pp. 918–931, Aug. 2008.

[27] K.-H. Chiang and N. Shenoy, “A 2-D random-walk mobility model for location-management studies in wireless networks,” IEEE Trans. Veh. Technol., vol. 53, no. 2, pp. 413–424, Mar. 2004.

[28] B. Sikdar, S. Kalyanaraman, and K. S. Vastola, “An integrated model for the latency and steady-state throughput of tcp connections,” Perform. Eval., vol. 46, no. 2-3, pp. 139–154, Oct. 2001.

[29] M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, 1st ed. New York, NY, USA: John Wiley & Sons, Inc., 1994.

[30] A. Aissioui, A. Ksentini, and A. Gueroui, “Pmipv6-based follow me cloud,” in Proc. IEEE Globecom, Dec. 2015.

[31] “NOX OpenFlow Controller,” http://www.noxrepo.orgs.

[32] Cisco, “Locator ID Separation Protocol (LISP) VM Mobility Solution,” Cisco Systems, Inc., Tech. Rep., 2011, White Paper.

[33] E. Kohler, R. Morris, B. Chen, J. Jannotti, and M. F. Kaashoek, “The click modular router,” ACM Trans. Comput. Syst., vol. 18, no. 3, pp. 263–297, Aug. 2000.

[34] J. Martins, M. Ahmed, C. Raiciu, V. Olteanu, M. Honda, R. Bifulco, and F. Huici, “ClickOS and the art of network function virtualization,” in Proc. USENIX NSDI, 2014.

[35] A. Kivity, Y. Kamay, D. Laor, U. Lublin, and A. Liguori, “kvm: the linux virtual machine monitor,” in Proc. Linux Symposium, vol. 1, Ottawa, Ontario, Canada, Jun. 2007, pp. 225–230.

[36] “Fast Data Transfer,” http://monalisa.cern.ch/FDT/.

[37] “Markov decision processes (mdp) toolbox,” http://www7.inra.fr/mia/T/MDPtoolbox/.

Tarik Taleb is currently a Professor at the School of Electrical Engineering, Aalto University, Finland. Previously, he was a senior researcher and 3GPP standards expert with NEC Europe Ltd., Heidelberg, Germany (2009-2015). Until March 2009, he was an assistant professor with the Graduate School of Information Sciences, Tohoku University. From October 2005 to March 2006, he was a research fellow with the Intelligent Cosmos Research Institute, Sendai. He received his B.E. degree (with distinction) in information engineering, and M.Sc. and Ph.D. degrees in information sciences from GSIS, Tohoku University, Sendai, Japan, in 2001, 2003, and 2005, respectively. His research interests include architectural enhancements to mobile core networks, cloud-based mobile networking, mobile multimedia streaming, inter-vehicular communications, and social media networking. Prof. Taleb has also been directly engaged in the development and standardization of the Evolved Packet System as a member of 3GPP’s System Architecture working group.

Adlen Ksentini is currently an Associate Professor with the University of Rennes 1, France. He is a member of the INRIA Rennes team Dionysos. He received the M.Sc. degree in telecommunication and multimedia networking from the University of Versailles, France, and the Ph.D. degree in computer science from the University of Cergy-Pontoise, France, in 2005, with a dissertation on QoS provisioning in IEEE 802.11-based networks. His other interests include future Internet networks, mobile networks, QoS, QoE, performance evaluation, and multimedia transmission.

Pantelis A. Frangoudis received his B.Sc. (2003), M.Sc. (2005), and Ph.D. (2012) degrees in Computer Science from the Department of Informatics, AUEB, Greece. Currently, he is a post-doctoral researcher at team Dionysos, IRISA/INRIA Rennes, France, which he joined under an ERCIM post-doctoral fellowship (2012-2013). His research interests include wireless networking, Internet multimedia, network security, future Internet architectures, and QoE monitoring and management.
