RESOURCE OPTIMIZATION ACROSS
GEOGRAPHICALLY DISTRIBUTED
DATACENTERS
by
Siqi Ji
A thesis submitted in conformity with the requirements
for the degree of Master of Applied Science
Edward S. Rogers Sr. Dept. of Electrical and Computer Engineering
University of Toronto
© Copyright 2017 by Siqi Ji
Resource Optimization Across Geographically
Distributed Datacenters
Master of Applied Science Thesis
Edward S. Rogers Sr. Dept. of Electrical and Computer Engineering
University of Toronto
by Siqi Ji
2017
Abstract
Cloud computing provides users and enterprises with shared pools of resources to store and process their data. To provide quality-of-service guarantees for customers and reduce resource wastage, resource management becomes a crucial problem for cloud providers. In this thesis, we propose and implement different approaches for resource optimization with various objectives. The resource optimization algorithms used by cloud providers have a significant impact on the performance of the virtual machines (VMs) that users rent for computation, as well as on the ability of datacenters to accommodate user requests. We propose a multi-dimensional online VM placement algorithm that can balance the usage of resources along multiple dimensions and improve VM performance effectively. There are also applications, such as geo-replication, that need to transfer data across datacenters within a given time period. We propose and implement an efficient solution that maximizes throughput for multiple concurrent inter-datacenter multicast transfers while meeting their deadlines.
TO MY PARENTS
Acknowledgments
First, and most importantly, I would like to express my deepest appreciation to my thesis supervisor, Professor Baochun Li, for his continuous support of my master's study and research at the University of Toronto. I benefited a lot from his immense knowledge, sharp vision, and scientific insights. He always gives me valuable advice, not only for my research but also for my career.
I also would like to thank my examination committee: Professor Ben Liang, Professor Shahrokh Valaee, and Professor Cristiana Amza, for their insightful comments and advice.
Third, I would like to thank all the members of the iQua research group: Xu Yuan, Jun Li, Zhiming Hu, Liyao Xiang, Li Chen, Shuhao Liu, Wenxin Li, Yinan Liu, Hao Wang, and Wanyu Lin. They are like my family and are always there to help me out. I learned a lot from them, and they made my master's life at U of T more fun and more fulfilling.
Last but not least, I want to thank my family — my father Youhui Ji, my mother Xuexiang Wang, and my fiancé Puwen Chen. I could not have done this without their support, understanding, and love. They never give up on me and always encourage me to do my best. There are not enough words to express my love for them.
Contents
Abstract
Acknowledgments
Contents
List of Tables
List of Figures
1 Introduction
1.1 Virtual Machine Placement
1.2 Inter-Datacenter Multicast Transfers with Deadlines
1.3 Thesis Organization
2 Related Work
2.1 Virtual Machine Placement
2.2 Inter-Datacenter Multicast Transfers with Deadlines
3 An Online Virtual Machine Placement Algorithm in an Over-Committed Cloud
3.1 Motivation Example
3.2 Min-DIFF: An Online VM Placement Algorithm
3.2.1 Resource Threshold
3.2.2 Find the Best PM for Single VM Request
3.2.3 VM Selection for Multiple VM Requests
3.2.4 Details of Min-DIFF
3.3 Performance Evaluation
3.3.1 Architecture of the Simulator
3.3.2 Simulation Setup
3.3.3 Simulation Results: Threshold 100%
3.3.4 Simulation Results: Threshold is Smaller than 100%
3.4 Summary
4 Deadline-Aware Scheduling and Routing for Inter-Datacenter Multicast Transfers
4.1 Motivation Example
4.2 System Model and Problem Formulation
4.2.1 Finding Feasible Steiner Trees
4.2.2 Linear Program Formulation
4.2.3 Choose Sparse Solutions
4.2.4 Proof of Convergence
4.2.5 An Example of the Optimal Solution
4.3 Implementation
4.4 Performance Evaluation
4.4.1 Experiment Setup
4.4.2 Evaluation Methodology
4.4.3 Evaluation Results
4.5 Summary
4.5.1 Discussion
4.5.2 Conclusion
5 Conclusion
Bibliography
List of Tables
3.1 Variables used in this chapter.
3.2 Amazon EC2 VM instances used in the first dataset.
3.3 Resource requirements used in the second and third datasets.
4.1 Request requirements for the motivation example.
4.2 Variables used in the chapter.
4.3 Request requirements for the example.
4.4 Comparison of the sparse routing approach and the original linear program; the left side shows the number of trees for the sparse solution, and the right side the number of trees for the original linear program.
List of Figures
3.1 A motivation example of VM placement.
3.2 A sketch of the threshold-based idea.
3.3 An illustrative example of resource fragmentation.
3.4 Architecture of the simulator.
3.5 Number of used PMs for different datasets in the single request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
3.6 The average resource fragmentation of all used PMs for different datasets in the single request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
3.7 Number of used PMs for different datasets in the multi-request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
3.8 The average resource fragmentation of all used PMs for different datasets in the multi-request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
3.9 Results of the homogeneous setting: (a) Number of used PMs. (b) The average resource fragmentation of all used PMs.
3.10 Results of the real-world workload trace: (a) Number of PMs. (b) The average resource fragmentation of all used PMs.
3.11 Comparison results of the light load scenario: (a) Number of PMs. (b) The average resource fragmentation of all used PMs.
3.12 Light load scenario: the percentage of PMs whose resource utilization is higher than 80% (over-committed resources are included): (a) CPU. (b) Memory.
3.13 Comparison results of the heavy load scenario: (a) Number of PMs. (b) The average resource fragmentation of all used PMs.
3.14 Comparison results of the number of failures.
4.1 A motivation example: (a) Finding paths from the source to each destination; request R2 will miss its deadline. (b) Using Steiner trees for transfers; both R1 and R2 can complete before their deadlines.
4.2 An example of the optimal solution obtained by solving the linear program in Sec. 4.2.3 for maximizing the total throughput of all requests.
4.3 Architecture of the application-layer SDN design.
4.4 The 6 Google Cloud datacenters used in our deployment and experiments.
4.5 Completion time deviation.
4.6 Comparison of different solutions for early deadline requests.
4.7 Comparison of different solutions as the number of destinations increases.
4.8 Throughput comparison of different solutions.
4.9 The computation time of our approach.
Chapter 1
Introduction
In the era of big data analytics, as the volume of data grows exponentially, the need
for data processing becomes more pressing than ever before. Cloud computing enables
cheap and easy access to shared pools of computational resources, which provides users
and enterprises with capabilities to store and process their data efficiently and reliably.
Different kinds of applications can be hosted in datacenters by renting virtual machines. These applications are characterized by diverse resource requirements across multiple dimensions, such as memory, CPU cores, storage space, and network bandwidth. Due to this variety of resource requirements, resource management involves some non-trivial challenges. One of these challenges is how to deploy the virtual machines (VMs) requested by applications in datacenters. Improper VM packing schemes can cause resource wastage and VM performance degradation. Applications running in a datacenter rely heavily on the performance of their VMs, so it is important to guarantee that each VM request gets its fair share of resources.
Another challenge is how to take full advantage of available inter-datacenter resources to meet more customer requirements. To increase availability and reduce latency for end users, large companies and cloud providers are deploying tens to hundreds of geographically distributed datacenters around the world. Many applications, such as geo-replication, need to deliver multiple copies of data from a single datacenter to multiple datacenters, which improves fault tolerance, increases availability, and achieves high service quality. These applications usually require completing multicast transfers before certain deadlines. Due to the limited bandwidth between datacenters, how to allocate and schedule bandwidth for inter-datacenter transfers so as to meet customer requirements becomes a crucial issue.
Therefore, in this thesis, we present our solutions to these two challenges in resource optimization. In the following sections, for each of these two problems, we give a brief overview of the background and the limitations of existing solutions, and present our contributions. Finally, we describe the organization of this thesis.
1.1 Virtual Machine Placement
Virtualization [1] is a key technology of cloud computing: it partitions physical hardware resources to create virtual machines (VMs) with dedicated resources. A VM acts like a real computer with its own operating system. Through virtualization, multiple VMs can run on the same physical machine (PM) at the same time, increasing the efficiency and utilization of hardware resources.
VM placement is the process of selecting the most appropriate PM for deploying
VMs. Since different users have diverse resource requirements for VMs and resources on
each PM are limited, improper VM placement can cause unbalanced resource utilization
(overloaded in some resources but underutilized in others). Such resource fragmentation
requires extra PMs and wastes more resources. Therefore, it is crucial to balance PM resources along multiple dimensions (e.g., CPU, memory, storage, and network bandwidth) during VM placement, and to minimize the number of activated PMs.
Existing works [2–4] indicated that VMs tend to utilize fewer resources than their reserved capacities, which causes substantial resource wastage. Resource overcommitment is widely used to mitigate this wastage by allocating more resources to VMs than the PM physically has. For example, if a PM has 64 GB of memory and is sold as 128 GB, we say that the overcommit ratio of this PM is 2. Commercial cloud management products such as VMware ESX Server incorporate resource overcommitment extensively [5]. It can be problematic to consider only minimizing the number of activated PMs and reducing resource fragmentation in an over-committed cloud. While resource overcommitment increases resource utilization and benefits cloud service providers, it also increases the risk of provider-induced overload. Overload happens when users collectively demand enough resources to exhaust all available physical resources, which degrades VM performance and could drive off users. As such, it is of interest to consider resource overcommitment during VM placement, so that the risk of service degradation is reduced while resource utilization is enhanced.
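As a concrete illustration, the following sketch works through the overcommitment arithmetic; the 64 GB PM sold as 128 GB comes from the example above, while the VM reservations and the 60% utilization level are hypothetical.

```python
# Overcommitment arithmetic for a single PM. The 64 GB / 128 GB figures
# are taken from the example in the text; the VM reservations and the
# utilization fraction below are hypothetical.
physical_gb = 64.0           # memory actually installed in the PM
advertised_gb = 128.0        # memory "sold" to VMs
overcommit_ratio = advertised_gb / physical_gb   # 128 / 64 = 2.0

vm_reservations_gb = [32, 24, 16, 40]  # reserved sizes; sum = 112 <= 128, so all fit nominally
utilization = 0.6                      # fraction of its reservation each VM actually uses

actual_usage_gb = utilization * sum(vm_reservations_gb)  # 0.6 * 112 = 67.2 GB
overloaded = actual_usage_gb > physical_gb               # 67.2 > 64, so the PM is overloaded

print(overcommit_ratio)  # 2.0
print(overloaded)        # True
```

Even though the reservations fit within the advertised 128 GB, moderate actual usage already exceeds the 64 GB of physical memory — exactly the provider-induced overload described above.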
There has been a significant amount of work on VM placement. Some existing works [6–9] considered only one resource dimension when placing VMs; these approaches overlooked the multi-dimensional nature of the problem and did not balance resources along different dimensions. Other works solved the problem in an offline optimization manner [10–14], assuming that resource requirements are known in advance. First-Fit Decreasing is a well-known heuristic of this kind, which sorts VMs by size in decreasing order before placing them. These approaches did not adequately reflect the true nature of VM requests, with their unpredictable arrivals and departures. Other approaches like [15, 16] studied the VM placement problem with the goal of achieving balanced resource utilization along multiple dimensions while minimizing the number of activated physical machines, objectives very close to ours. However, they did not consider the PM overloading problem in an over-committed cloud. The overcommitment issue is typically considered in VM migration [2, 3], and very few works have taken it into account during VM placement.
To address the challenge of deploying VMs, we solve the VM placement problem with the objective of balancing the use of resources along multiple dimensions and reducing the risk of PM overloading, while taking overcommitment into account. Our contributions are the following:
First, we consider multiple dimensions of resources in VM placement, and generate both homogeneous and heterogeneous settings of PM resource capacity configurations in our simulations, which realistically resemble modern cloud datacenters.
Second, we consider a model in which VM deployment requests arrive and depart dynamically, in an online manner where resource requirements are not known beforehand. This dynamic nature produces a realistic scenario.
Third, we propose a threshold-based algorithm called Min-DIFF, which considers resource overcommitment in an effort to reduce the risk of service degradation. In addition, Min-DIFF obtains a more balanced use of resources along different dimensions than related works, which reduces resource fragmentation effectively.
Finally, our simulations are driven by both real-world traces and datasets we generated, to evaluate the effectiveness of Min-DIFF under a wide spectrum of conditions. Our simulation results show that Min-DIFF performs better in three aspects. First, Min-DIFF uses fewer PMs and achieves lower resource fragmentation than other approaches when the overcommitment issue is not taken into account. Second, Min-DIFF has a lower risk of PM overloading than other approaches in an over-committed cloud. Third, Min-DIFF is better able to accommodate VM requests than related works.
1.2 Inter-Datacenter Multicast Transfers with Deadlines
To improve fault tolerance, increase availability, and achieve high service quality, many applications require efficient data transfers from one datacenter to multiple datacenters, typically for data replication, database synchronization, and data backup. For example, search engines need to synchronize databases regularly to achieve a higher quality of user experience [17]. Blocks of a file in many distributed file systems, such as HDFS, are replicated for fault tolerance.
Inter-datacenter transfers can roughly be classified into three categories based on their delay tolerance: interactive transfers, elastic transfers, and background transfers [18]. Interactive transfers, like video streams and web requests, are highly sensitive to loss and delay, so they should be delivered instantly with strictly higher priority. Elastic transfers are delay-tolerant but still require timely delivery (before a deadline); for example, many applications need to back up data at regular intervals. Background transfers, such as data warehousing, do not have explicit deadlines.
Why do we need to consider transfer deadlines? When multiple inter-datacenter transfers share the same links in the inter-datacenter network, the total demand of these transfers typically far exceeds the available network capacity. On the one hand, some transfers, such as elastic transfers, need to be completed in a timely manner, a requirement that can be modeled as deadlines. On the other hand, cloud providers set deadlines for most transfers based on their delay tolerance and on customer service level agreements (SLAs). A survey of WAN customers at Microsoft [19] shows that most transfers require deadlines, that missing a deadline incurs a penalty, and that customers are willing to pay more for guaranteed deadlines. Therefore, meeting as many transfer deadlines as possible is an important topic.
For resource optimization in the inter-datacenter network, we focus on elastic and background transfers that deliver data from one datacenter to multiple datacenters. We propose an efficient solution that maximizes network throughput while considering transfer deadlines at the same time. The multicast (one-to-many) transfer type is quite representative, since other transmission types like unicast (one-to-one) and broadcast (one-to-all) can be transformed into it.
Traditional wisdom used Steiner Tree Packing [20, 21] to maximize the flow rate from a source to multiple destinations, which is an NP-complete problem. Another approach is to treat the multicast transmission as multiple unicast transfers. Existing solutions like B4 [22], SWAN [18], and BwE [23] aimed to maximize utilization and focused on max-min fairness. Tempus [24] designed a strategy to maximize the minimum fraction of transfers finished before their deadlines. Amoeba [25] guaranteed deadlines by introducing a deadline-based network abstraction for inter-datacenter transfers. DCRoute [17] scheduled each transfer on a single path to avoid packet reordering, and it also guaranteed transfer deadlines for admitted requests.
Unfortunately, these solutions were not explicitly designed for multicast transfers, and can actually waste bandwidth by finding paths from the source to each destination, resulting in more transfer requests with deadline requirements being rejected. DCCast [26] and DDCCast [27] proposed to use minimum weight Steiner trees, and DDCCast used an As Late As Possible (ALAP) policy for rate allocation. However, DCCast and DDCCast used a single minimum weight Steiner tree for each request, which reduces the flexibility of choosing routing paths. Moreover, if the bandwidth required by a request with a specific deadline is higher than the maximum available bandwidth in the network, the request will be rejected when only one tree can be chosen; if we can instead split the traffic at the source and use multiple trees for delivering data, the request can meet its deadline with higher throughput. In addition, in the admission control part of DDCCast, a request can be rejected although it could have been admitted by choosing other forwarding trees. Neither DCCast nor DDCCast aimed to maximize throughput.
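The benefit of splitting traffic across multiple trees can be seen with a back-of-the-envelope calculation. All numbers below are hypothetical, and each tree is summarized by its bottleneck rate:

```python
# Why splitting a multicast transfer across several Steiner trees can
# save a deadline: a toy calculation with hypothetical numbers.
volume_gb = 100.0    # data to deliver to all destinations
deadline_s = 400.0   # seconds until the deadline
required_rate = volume_gb / deadline_s   # 0.25 GB/s needed overall

# Bottleneck rate of each candidate tree (GB/s).
tree_capacities = [0.15, 0.12]

# With a single tree, the request must fit on the best tree alone:
feasible_single = max(tree_capacities) >= required_rate   # 0.15 < 0.25
print(feasible_single)  # False

# Splitting traffic at the source across both trees:
feasible_split = sum(tree_capacities) >= required_rate    # 0.27 >= 0.25
print(feasible_split)   # True
```

No single tree can sustain the 0.25 GB/s the deadline implies, but the two trees together can, so the request need not be rejected.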
To tackle this problem, we design a new routing and scheduling algorithm for multiple multicast data transfers across geo-distributed datacenters, maximizing network throughput while taking transfer deadlines into consideration. We have implemented our solution in an application-layer software-defined inter-datacenter network, and evaluated its performance with real-world experiments on the Google Cloud Platform. Our contributions are the following:
First, prior works on inter-datacenter traffic engineering [17, 18, 22–25] focused on unicast transfers, which are not effective for multicast transfers. We propose to use Steiner trees for each multicast transfer.
Second, prior work on multicast inter-datacenter transfers [26, 27] used one tree for each transfer, which could reject some transfer requests with early deadlines. Our solution offers higher routing flexibility and uses at least one tree for each transfer. We formulate the problem as a Linear Program (LP), which can pack multiple multicast transfers with deadlines efficiently and achieve high throughput. In addition, to reduce packet reordering overhead at the destination, we add a penalty function to the objective of the LP and use a log-based heuristic [28, 29] to find sparse solutions.
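As a rough illustration of the constraint structure of such an LP (not the actual formulation, which appears in Chapter 4), the following sketch checks a candidate rate allocation on a toy instance; the topology, candidate trees, and all rates are hypothetical.

```python
# A toy feasibility check mirroring the LP's two constraint families.
# For each request r and candidate Steiner tree t, x[(r, t)] is the
# sending rate on that tree. A solution is feasible if (i) no link
# capacity is exceeded and (ii) each request's total rate meets the
# minimum rate implied by its deadline.
link_capacity = {"AB": 1.0, "AC": 1.0, "BC": 0.5}

# Candidate trees per request, each given as the set of links it uses.
trees = {"R1": {"T1": {"AB", "AC"}, "T2": {"AB", "BC"}}}

# Minimum rate per request: data volume divided by time to deadline.
min_rate = {"R1": 0.8}

def is_feasible(x):
    # (i) Link capacity constraints.
    for link, cap in link_capacity.items():
        load = sum(rate for (req, tree), rate in x.items()
                   if link in trees[req][tree])
        if load > cap + 1e-9:
            return False
    # (ii) Deadline (minimum-rate) constraints.
    for req, rate in min_rate.items():
        total = sum(r for (q, _), r in x.items() if q == req)
        if total + 1e-9 < rate:
            return False
    return True

# One feasible allocation that splits R1's traffic across both trees.
x = {("R1", "T1"): 0.6, ("R1", "T2"): 0.2}
print(is_feasible(x))  # True
```

The actual LP maximizes total throughput (the sum of all rates) over these constraints; the penalty term mentioned above additionally discourages spreading a request over many trees.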
Third, prior work on multicast techniques [30–34] used software-defined networking (SDN) at the network layer. However, hardware switches in each datacenter can only support a limited number of forwarding entries, and it is complicated and costly to solve the flow table scalability problem at large scales. We have implemented our solution as an application-layer software-defined network, which does not need to modify the underlying network and can scale up to a large number of transfer requests.
Fourth, our real-world experimental results on the Google Cloud Platform show that our solution achieves higher throughput and accommodates more transfer requests with deadlines, as compared with existing related works that consider deadlines.
1.3 Thesis Organization
The remainder of this thesis is organized as follows. In Chapter 2, we discuss the related works regarding these two problems. To balance the usage of resources across multiple dimensions and reduce the risk of PM overloading, in Chapter 3 we propose a threshold-based online VM placement algorithm. In Chapter 4, we consider the resource optimization challenge across different datacenters: we propose to use multiple Steiner trees for multicast transfers, with the purpose of maximizing network throughput and meeting as many transfer deadlines as possible. Finally, we summarize our work in Chapter 5.
Chapter 2
Related Work
Problems of resource optimization across datacenters have been extensively studied in the field of cloud computing. In this chapter, we first present related works about deploying VM requests in datacenters, and then discuss existing works related to resource optimization for inter-datacenter transfers.
2.1 Virtual Machine Placement
VM placement can be formulated as a bin packing problem, which is proven to be NP-hard [35]. Many heuristics have been proposed to solve this problem. A widely used approach is the First Fit heuristic, which allocates each VM request to the first available PM. The limitation of First Fit is that resources on PMs may become imbalanced. Some approaches mainly used one resource type as the allocation criterion; Min-Min and Max-Min are two well-known heuristics that assign a VM request based on CPU capacity.
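The First Fit heuristic just described can be sketched as follows, using hypothetical capacities and a single resource dimension — which is exactly the simplification the surrounding discussion criticizes:

```python
# First Fit for VM placement along one resource dimension (hypothetical
# sizes). Each VM goes to the first PM with enough remaining capacity;
# a new PM is activated when none fits.
def first_fit(vm_sizes, pm_capacity):
    """Return a list of PMs, each a list of the VM sizes placed on it."""
    free = []       # remaining capacity of each activated PM
    placement = []  # VM sizes placed on each PM
    for size in vm_sizes:
        for i, remaining in enumerate(free):
            if size <= remaining:
                free[i] -= size
                placement[i].append(size)
                break
        else:  # no existing PM fits: activate a new one
            free.append(pm_capacity - size)
            placement.append([size])
    return placement

print(first_fit([8, 4, 2, 6, 3], 10))  # [[8, 2], [4, 6], [3]]
```

Note how PM utilization ends up uneven along a single dimension already; with multiple resource dimensions the imbalance noted in the text only gets worse.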
Monil et al. [6] proposed a multi-pass Best Fit Decreasing VM placement algorithm that achieved a balance between energy consumption and quality of service. It did not consider multiple dimensions of resources, focusing instead on CPU utilization only.
Wang et al. [7] formulated the VM placement problem as a Stochastic Bin Packing problem, using random variables to characterize the uncertain future bandwidth usage of each VM. However, their probabilistic characterization considered only bandwidth. Zhang et al. [8] formulated the problem as a constrained minimum k-cut problem under the constraint of VM performance. Nevertheless, their solution used energy consumption as a single criterion, rather than specific PM resources. These approaches did not consider balancing the use of multiple resources, which can leave one resource highly utilized while other resources are under-utilized.
Other approaches solved the VM placement problem in an offline manner, assuming that the VM requests are known beforehand. Beloglazov et al. [14] modified the Best Fit Decreasing algorithm for deploying VMs, sorting all VM requests in decreasing order of their CPU requirements. Many related works like [9, 36–40] formulated VM placement as an optimization problem, which is not suitable for dynamic VM requests. In detail, Xu et al. [36] formulated the static VM placement scenario as a multi-objective optimization problem and built on a two-level control genetic algorithm to achieve high scalability and robustness. Adamuthe et al. [37] also solved a multi-objective optimization problem, with the objectives of maximizing profit, maximizing load balance, and minimizing resource wastage. Yanagisawa et al. [9] presented a mixed integer programming approach for the optimal placement of VMs with respect to minimizing PM resources while guaranteeing fault tolerance; it mainly focused on CPU resources. OVMP [39] used the optimal solution of a stochastic integer program (SIP) to minimize the cost of hosting virtual machines across multiple cloud providers while considering future demand and price uncertainty. Rampersaud et al. [10] designed an approximation algorithm that took multi-dimensional resources into account to maximize the profit derived from hosting VMs. However, in real scenarios, requests arrive at different time slots, and it is hard to know VM requests in advance.
There are many related works that proposed approaches for online VM placement [15, 16, 41–44] with consideration of multi-dimensional resources, but few considered the overcommitment issue in VM placement. Alicherry et al. [44] optimized data access latencies by using an intelligent VM placement algorithm. Dong [42] combined minimum cut with best-fit to design a novel greedy algorithm that reduces the number of activated physical servers and network elements to achieve energy savings. Mishara et al. [43] proposed a methodology for dynamic VM placement based on vector arithmetic. Max-BRU [15] considered multiple resource types, focusing on maximizing resource utilization while balancing the use of multiple types of resources. EAGLE [16] proposed a multi-dimensional space partition model to balance resource utilization along different dimensions while minimizing the total energy consumed by running PMs. Overcommitment is mainly considered in VM migration [2, 3]. In this thesis, we use a threshold-based idea to efficiently reduce the risk of PM overloading caused by overcommitment, which also reduces migration overhead.
2.2 Inter-Datacenter Multicast Transfers with Deadlines
There is a large body of related work on datacenter traffic engineering and deadline-aware routing. Video streaming is one type of multicast transfer, which needs to deliver video content from a single source to users in remote regions; this kind of transfer is highly delay-sensitive. Celerity [45] packed only depth-1 and depth-2 trees, Airlift [46] maximized throughput without violating end-to-end delay constraints by using network coding, and Liu et al. [47] proposed a delay-optimized routing scheme that solves only linear programs. However, these works were explicitly designed for delay-sensitive video streaming. We focus on elastic and background transfers, which are delay-tolerant, some of which have deadlines.
Some existing works focused on improving the performance of bulk transfers. Laoutaris et al. [48] proposed NetStitcher, which minimized the completion time of bulk transfers by stitching together unutilized bandwidth and employing a store-and-forward algorithm. They extended this work in [49] by considering time-zone differences for delay-tolerant bulk transfers. Chen et al. [50] considered bulk transfers with deadlines in grid networks. Store-and-forward is also used in [51, 52] to complete transfers: Wang et al. [51] aimed to minimize the network congestion of deadline-constrained bulk transfers, and Wu et al. [52] concentrated on a per-chunk routing scheme. However, storing data at intermediate datacenters increases storage cost and transfer overhead. Owan [53] jointly optimized bulk transfers in the optical and network layers. These works considered only unicast bulk transfers.
Google B4 [22] and Microsoft SWAN [18] used SDN for inter-datacenter traffic engineering to maximize network throughput. BwE [23] provided work-conserving bandwidth allocation and focused on max-min fairness. These works did not consider transfer deadlines. Tempus [24] proposed an online scheduling scheme to maximize the minimum fraction of inter-datacenter transfers finished before their deadlines. Amoeba [25] and DCRoute [17] guaranteed deadlines for admitted requests, but they were not explicitly designed for multicast transfers.
DCCast [26] chose minimum weight forwarding trees for transfers and focused on multicast transfers. DDCCast [27] was based on DCCast and took transfer deadlines into consideration. Our work differs from DDCCast in that our solution chooses at least one tree for each transfer, which can accommodate more transfer requests than using exactly one tree and achieves higher throughput.
Chapter 3
An Online Virtual Machine
Placement Algorithm
in an Over-Committed Cloud
Public cloud providers such as Amazon EC2 and Google Cloud Platform provide users
with a massive pool of PMs to create different types of VMs for storing and processing
data. In cloud computing, after users submit their VM requests, VM placement is
conducted to select the most suitable PM to host each VM. The performance varies for
different VM placement schemes. Most of the existing works only consider maximizing
the resource utilization of PMs without taking the overcommitment issue into account,
which can cause PM overloading and degrade VM performance. In this chapter, we
propose an algorithm, called Min-DIFF, that can balance the usage of resources along
multiple dimensions and reduce the risk of PM overloading effectively.
This chapter is organized as follows. In Sec. 3.1, we motivate our work by using
a simple example. We discuss how to deploy VM requests and illustrate details of our
algorithm Min-DIFF in Sec. 3.2. In Sec. 3.3, we explain the architecture of our simulator,
present our simulation setup and show the simulation results by using different datasets.
Finally, we conclude the chapter in Sec. 3.4.
3.1 Motivation Example
Previous works on VM placement try to maximize the utilization of PMs and pack VMs
as tightly as possible. However, resource overcommitment may cause PM overloading
when the total resources utilized by VMs exceed the PM's actual capacities. We use an
example to illustrate PM overloading, considering only memory for simplicity. As shown
in Figure 3.1(a), suppose there are three VM requests, VM1, VM2, and VM3, requiring
32GB, 24GB, and 16GB of memory, respectively. The memory capacity of the PM is
36GB, and it is sold as 72GB with an overcommit ratio of 2. If we pack all three VMs
into this PM and the VMs utilize 60% of their requested resources, the PM will be
overloaded. Overloading can substantially degrade VM performance, and some VMs will
not get their fair share of resources. A better approach is to set an 80% threshold on
the 72GB of memory; the total resources of VMs placed in the PM cannot exceed this
threshold. As we can see from Figure 3.1(b), only VM1 and VM2 are placed in this PM,
and VM3 is placed in another PM. In this way, the resources utilized by VMs will not
exceed the PM's capacities, and all VMs can obtain good performance. Related works
only consider the overcommitment issue in VM migration. Nevertheless, migrating
VMs away from overloaded PMs causes extra overhead and increases bandwidth usage.
We save this overhead and network bandwidth by considering overcommitment in the
initial VM placement. We will discuss the setting of the threshold later.
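The arithmetic behind this example can be sketched in a few lines (a minimal illustration; the function name is mine, and the 60% utilization and capacity figures are taken from the example above):

```python
def is_overloaded(requested_gb, utilization, actual_capacity_gb):
    """A PM is overloaded when the memory its VMs actually use
    exceeds the PM's physical capacity."""
    used = sum(requested_gb) * utilization
    return used > actual_capacity_gb

# Physical capacity 36 GB, sold as 72 GB (overcommit ratio 2).
vms = [32, 24, 16]          # VM1, VM2, VM3 memory requests (GB)

# Packing all three: 72 GB requested, 60% utilized -> 43.2 GB > 36 GB.
print(is_overloaded(vms, 0.6, 36))        # True: the PM is overloaded

# With an 80% threshold on the sold 72 GB (57.6 GB), only VM1 and VM2
# fit (56 GB requested); 60% utilized -> 33.6 GB <= 36 GB.
print(is_overloaded(vms[:2], 0.6, 36))    # False: within capacity
```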
Figure 3.1: A motivation example of VM placement: (a) Packing VMs as tightly as possible. (b) Setting an 80% threshold for the over-committed PM.
3.2 Min-DIFF: An Online VM Placement Algorithm
In this section, we present our proposed threshold-based online VM placement algorithm
Min-DIFF, which can reduce resource fragmentation efficiently with the consideration of
the overcommitment issue.
Figure 3.2 gives a sketch of the threshold-based idea, with grey squares representing
different VMs. We deploy VMs using one of the two strategies shown in Figure 3.2. In
Strategy 1, we select the most appropriate PM to place VMs under the threshold. If we
cannot find space under the threshold, we fall back to Strategy 2 and place VMs without
considering the threshold; Strategy 1 always has the highest priority. Table 3.1 presents
the variables used in this chapter and their definitions. A VM request i is denoted as a
tuple {a_i, du_i, VM_i^d}.
Variable    Meaning
D           The number of resource dimensions.
d           The index of resource dimensions, d = 1, ..., D.
j           The index of PMs.
i           The index of VM requests.
U_j^d       Used resource along dimension d of the jth PM.
PM_j^d      Total resource along dimension d of the jth PM.
VM_i^d      Resource requirement along dimension d of the ith VM request.
a_i         Arrival time of the ith VM request.
du_i        The duration of the ith VM request.
w^d         The warning line of PMs along dimension d.
L^d         The largest VM resource requirement along dimension d.
Th_j^d      The threshold along dimension d of the jth PM.
RF_j        Resource fragmentation of the jth PM.
NR_j^d      The normalized residual resource along dimension d of the jth PM.
NU_j^d      The normalized used resource along dimension d of the jth PM.

Table 3.1: Variables used in this chapter.
Strategy 1: Place VMs below the threshold.
Strategy 2: Place VMs without considering the threshold.

Figure 3.2: A sketch of the threshold-based idea.
3.2.1 Resource Threshold
Typically, to guarantee performance for most VMs and reduce the risk of PM overloading,
some providers do not want the utilization of over-committed PMs to exceed a specific
percentage [54]; we call this percentage the warning line in this chapter. For example,
if a PM is sold as 72GB of memory with an overcommit ratio of 2 and the warning line
for memory is 80%, then the total memory of VMs in this PM should not exceed 57.6GB.
The warning line is taken into account when we set the resource threshold in Min-DIFF.
On the other hand, to reduce resource fragmentation, we reserve enough space above the
threshold for large VMs. Otherwise, if we cannot find enough space below the threshold
and need to use Strategy 2, a large VM cannot be placed in the PM, which causes large
resource fragmentation. Therefore, based on the warning line w^d and the
largest VM requirement L^d, the threshold Th_j^d is defined as:

    Th_j^d = min{ (PM_j^d − L^d) / PM_j^d , w^d }.    (3.1)
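As a sketch, the threshold for the running example (a PM sold as 72 GB of memory, a largest VM request of 32 GB, and an 80% warning line) follows directly from Equation (3.1); the function name is illustrative:

```python
def threshold(pm_capacity, largest_vm, warning_line):
    """Threshold along one dimension, per Equation (3.1): reserve room
    for the largest VM, but never exceed the provider's warning line."""
    return min((pm_capacity - largest_vm) / pm_capacity, warning_line)

# (72 - 32) / 72 ≈ 0.556 is stricter than the 0.8 warning line,
# so the reserved-space term dominates here.
print(threshold(72, 32, 0.8))
```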
3.2.2 Find the Best PM for Single VM Request
If there is a single VM request at a time slot, we find the best PM for this request; this
section illustrates how that PM is chosen.
In order to leave room for future VM requests, we try to balance the residual resources
along multiple dimensions on each PM. Otherwise, if the residual resources along one
dimension are exhausted, the resources along the other dimensions are wasted. Such
resource fragmentation prevents future VM requests from being placed and wastes resources.
As shown in Figure 3.3, we use a simple example to illustrate the concept of resource
fragmentation. The biggest rectangle represents the total CPU and memory capacities
of a PM. Three VMs are deployed in this PM. The three small rectangles denote the
amount of CPU and memory allocated to each VM. Once a VM is placed in the PM, the
available resource capacity is reduced along each dimension. In this example, the PM has
a lot of available CPU but very little unused memory, which prevents further VM requests
from being placed in this PM due to insufficient memory.
Since a datacenter provides a pool of resources such as CPU, memory, network
bandwidth, and storage, we deal with multiple resource types in VM placement. Inspired
by previous works [36, 55, 56], we extend the resource wastage model, which was specific
to two dimensions, to multiple dimensions. The following equation is used to measure
Figure 3.3: An illustrative example of resource fragmentation.
the resource fragmentation of a PM:

    RF_j = [ Σ_{p, p≠m} (NR_j^p − NR_j^m) ] / [ Σ_{d=1}^{D} NU_j^d ],    (3.2)
where RF_j represents the resource fragmentation of the jth PM. NR_j^m indicates the
smallest normalized residual resource, and NR_j^p denotes the normalized residual resource
along dimension p, where p ≠ m. The numerator therefore sums the differences between
the smallest normalized residual resource and the others, and the denominator sums the
normalized used resource along each dimension. When computing the resource fragmentation,
both the residual and the used resource on the PM are normalized by the PM's overall
capacity. Evidently, the more resource is used and the more balanced the residual
resources are across dimensions, the smaller the resource fragmentation value.
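Equation (3.2) can be computed directly; a minimal sketch (the function name is mine, not from the thesis):

```python
def fragmentation(used, capacity):
    """Resource fragmentation of one PM, per Equation (3.2).
    `used` and `capacity` are per-dimension lists."""
    nu = [u / c for u, c in zip(used, capacity)]   # normalized used
    nr = [1 - x for x in nu]                       # normalized residual
    m = min(nr)                                    # smallest residual
    return sum(r - m for r in nr) / sum(nu)

# A PM with plenty of CPU left but almost no memory fragments badly:
print(fragmentation([2, 30], [16, 32]))   # unbalanced residuals, high RF
print(fragmentation([8, 16], [16, 32]))   # perfectly balanced -> 0.0
```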
To reduce resource fragmentation and pack VMs tightly, we propose an efficient
algorithm based on the resource fragmentation Equation (3.2). An intuitive idea is to
choose the PM that has the minimal resource fragmentation value after the VM is placed.
However, this idea is problematic for utilized PMs. For example, suppose there are
two utilized PMs and a VM needs to be placed. If placing the VM in PM1 yields RF_1 = 0.3
and placing it in PM2 yields RF_2 = 0.2, then PM2 is selected. Nevertheless, this idea
ignores the resource fragmentation value before the VM is placed: if RF_1 = 0.6
and RF_2 = 0.1 beforehand, deploying the VM in PM2 actually makes its resource
fragmentation value higher.
A proper approach is: for non-empty PMs, we deploy the VM in the PM that yields the
largest resource fragmentation reduction. Before a VM is placed, the normalized used
resource is:

    NU_{j,bef}^d = U_j^d / PM_j^d,   d = 1, ..., D.    (3.3)

The normalized residual resource is:

    NR_{j,bef}^d = 1 − NU_{j,bef}^d,   d = 1, ..., D.    (3.4)
The smallest value among the NR_{j,bef}^d is NR_{j,bef}^m; the initial resource
fragmentation is then:

    RF_{j,bef} = [ Σ_{p, p≠m} (NR_{j,bef}^p − NR_{j,bef}^m) ] / [ Σ_{d=1}^{D} NU_{j,bef}^d ].    (3.5)
Similarly, after the VM is placed, the normalized used resource is:

    NU_{j,aft}^d = (U_j^d + VM_i^d) / PM_j^d,   d = 1, ..., D.    (3.6)
The normalized residual resource is:

    NR_{j,aft}^d = 1 − NU_{j,aft}^d,   d = 1, ..., D.    (3.7)
We again find the smallest value among the NR_{j,aft}^d, denoted NR_{j,aft}^m. The
resource fragmentation value after deploying the VM is therefore:

    RF_{j,aft} = [ Σ_{p, p≠m} (NR_{j,aft}^p − NR_{j,aft}^m) ] / [ Σ_{d=1}^{D} NU_{j,aft}^d ].    (3.8)
Then we calculate the difference between the resource fragmentation before and after
the VM is placed:

    δRF_j = RF_{j,bef} − RF_{j,aft}.    (3.9)
For non-empty PMs, we choose the PM that has the largest δRF_j. For empty PMs, we
select the PM with the most balanced utilization along resource dimensions: we calculate
the differences between the smallest normalized residual resource NR_j^m and the others
NR_j^p, and choose the PM that has the smallest RF_{j,empty}:

    RF_{j,empty} = Σ_{p, p≠m} (NR_j^p − NR_j^m).    (3.10)
To pack VMs tightly, we first find the most appropriate PM among the non-empty PMs;
if there is no available utilized PM, we choose the best PM among the empty PMs.
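Putting Equations (3.9) and (3.10) together, the PM-selection rule can be sketched as follows (a self-contained sketch; the helper names and the dictionary layout for a PM are my own, not from the thesis):

```python
def _fragmentation(used, capacity):
    # Equation (3.2): imbalance of normalized residuals over total use.
    nu = [u / c for u, c in zip(used, capacity)]
    nr = [1 - x for x in nu]
    m = min(nr)
    return sum(r - m for r in nr) / sum(nu)

def delta_rf(used, capacity, vm):
    """Fragmentation reduction if `vm` is placed on this PM (Eq. 3.9)."""
    after = [u + r for u, r in zip(used, vm)]
    return _fragmentation(used, capacity) - _fragmentation(after, capacity)

def rf_empty(capacity, vm):
    """Balance score for placing `vm` on an empty PM (Eq. 3.10)."""
    nr = [1 - r / c for r, c in zip(vm, capacity)]
    m = min(nr)
    return sum(x - m for x in nr)

def best_pm(pms, vm):
    """Utilized PM with the largest reduction wins; otherwise the empty
    PM on which `vm` leaves the most balanced residuals."""
    utilized = [p for p in pms if any(p["used"])]
    if utilized:
        return max(utilized, key=lambda p: delta_rf(p["used"], p["cap"], vm))
    return min(pms, key=lambda p: rf_empty(p["cap"], vm))

pms = [{"used": [0, 0], "cap": [100, 100]},
       {"used": [50, 10], "cap": [100, 100]}]
print(best_pm(pms, [10, 40]) is pms[1])   # the utilized PM is preferred
```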
3.2.3 VM Selection for Multiple VM Requests
If there are multiple VM requests at a time slot, we do the placement as follows.
When resources are available on a PM, we select the set of VM requests at the current
time slot whose resource requirements can be accommodated by that PM. If the PM is
utilized, we compute δRF_j for each VM request in this set; the request with the largest
δRF_j is placed in the PM. If the PM is empty, we compute RF_{j,empty} and place the
VM with the smallest value. This process repeats until the PM cannot accommodate any
remaining VM request in the current time slot; then we move to the next PM to place
the other VM requests.
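The per-PM selection loop above can be sketched generically; the `score` callback stands in for δRF_j (utilized PM) or −RF_{j,empty} (empty PM), and all names here are illustrative:

```python
def fill_pm(pm_free, requests, score):
    """Greedily fill one PM: repeatedly place the feasible request
    with the best score until none fits."""
    placed = []
    free = list(pm_free)
    while True:
        feasible = [r for r in requests
                    if r not in placed
                    and all(d <= f for d, f in zip(r, free))]
        if not feasible:
            break                      # move on to the next PM
        best = max(feasible, key=score)
        placed.append(best)
        free = [f - d for f, d in zip(free, best)]
    return placed, free

# Tiny example: a 2-dimensional PM with (8, 8) free; score = total demand.
reqs = [(4, 2), (6, 6), (2, 2)]
placed, free = fill_pm((8, 8), reqs, score=sum)
print(placed, free)   # (6, 6) then (2, 2); (4, 2) no longer fits
```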
3.2.4 Details of Min-DIFF
Min-DIFF is illustrated by Algorithm 3.1. First of all, we calculate the threshold for
each PM based on Equation (3.1) (lines 2-4). If there are multiple requests at the time
slot, we use Strategy 1 to place the current VMs below the threshold by calling the
function PlaceCurVMsBlwTh(current_VMs), which is presented as Algorithm 3.2. If
there is not enough space for all current VMs, we use Strategy 2, where the function
PlaceCurVMs(current_VMs) is called. PlaceCurVMs(current_VMs) is similar to
PlaceCurVMsBlwTh(current_VMs); the difference is at line 9, which checks whether the
PM can accommodate the current unplaced VMs without considering the threshold.
If there is only one VM request at the time slot, we choose the best PM for this VM.
The function FindBestPM(v, PMs) implements Strategy 2 and is similar to the Strategy 1
function FindBestPMBlwTh(v, PMs) (Algorithm 3.3). The difference between these two
functions is at line 12: FindBestPM(v, PMs) checks whether there is enough space in a
PM for the VM request, regardless of the threshold.
Algorithm 3.1 Min-DIFF algorithm
 1: function VMplacement(VMs, PMs)
 2:   for m in PMs do:
 3:     calculate threshold based on Equation (3.1)
 4:   end for
 5:   while VMs ≠ ∅ do
 6:     current_VMs = requests at the current time slot
 7:     remove current_VMs from VMs
 8:     if length(current_VMs) > 1 then:
 9:       flag, current_VMs
10:         = PlaceCurVMsBlwTh(current_VMs)
11:       if flag is False then:
12:         PlaceCurVMs(current_VMs)
13:       end if
14:     else if length(current_VMs) = 1 then:
15:       for v in current_VMs do
16:         BestPM = FindBestPMBlwTh(v, PMs)
17:         if BestPM is not None then:
18:           Place VM v on BestPM
19:           Remove VM v from current_VMs
20:           continue
21:         end if
22:         BestPM = FindBestPM(v, PMs)
23:         if BestPM is not None then:
24:           Place VM v on BestPM
25:           Remove VM v from current_VMs
26:         end if
27:       end for
28:     end if
29:   end while
30: end function
Algorithm 3.2 Place multiple current VM requests below the threshold
 1: function PlaceCurVMsBlwTh(current_VMs)
 2:   for m in PMs do:
 3:     while True do:
 4:       score_used = −∞
 5:       score_empty = +∞
 6:       placed = False
 7:       selected_VM = 0
 8:       for v in current_VMs do:
 9:         if v can be placed below the threshold then:
10:           placed = True
11:           if m is utilized then:
12:             calculate δRF_j based on Equation (3.9)
13:             if δRF_j > score_used then:
14:               score_used = δRF_j
15:               selected_VM = v
16:             end if
17:           else:
18:             calculate RF_{j,empty} based on Equation (3.10)
19:             if RF_{j,empty} < score_empty then:
20:               score_empty = RF_{j,empty}
21:               selected_VM = v
22:             end if
23:           end if
24:         end if
25:       end for
26:       if placed then:
27:         Place VM selected_VM in PM m
28:         remove VM selected_VM from current_VMs
29:       else:
30:         break
31:       end if
32:     end while
33:   end for
34:   if length(current_VMs) = 0 then:
35:     return True, current_VMs
36:   else:
37:     return False, current_VMs
38:   end if
39: end function
Algorithm 3.3 Find the best PM below the threshold
 1: function FindBestPMBlwTh(v, PMs)
 2:   score_used = −∞
 3:   score_empty = +∞
 4:   placed_used = False
 5:   placed_empty = False
 6:   used_PM = 0
 7:   empty_PM = 0
 8:   for m in PMs do:
 9:     if placed_used is True and m is empty then:
10:       continue
11:     else:
12:       if v can be placed below the threshold then:
13:         if m is utilized then:
14:           calculate δRF_j based on Equation (3.9)
15:           if δRF_j > score_used then:
16:             score_used = δRF_j
17:             used_PM = m
18:           end if
19:           placed_used = True
20:         else:
21:           calculate RF_{j,empty} based on Equation (3.10)
22:           if RF_{j,empty} < score_empty then:
23:             score_empty = RF_{j,empty}
24:             empty_PM = m
25:           end if
26:           placed_empty = True
27:         end if
28:       end if
29:     end if
30:   end for
31:   if placed_used = True then:
32:     return used_PM
33:   else if placed_empty = True then:
34:     return empty_PM
35:   else:
36:     return None
37:   end if
38: end function
3.3 Performance Evaluation
Now we are ready to evaluate the performance of Min-DIFF through simulations. In this
section, we present the architecture of our simulator, simulation setup and evaluation
results.
3.3.1 Architecture of the Simulator
Figure 3.4 shows the architecture of our simulator. The simulator first loads VM requests
from workload traces; backlogged requests are the VM requests in the current time slot.
The VMs are placed in PMs one by one based on the online placement algorithm. If there
is not enough space for a VM, the request is discarded, and the simulator treats it as a
failure. Different VMs have different durations, and the simulator updates the status of
the PMs once a VM is deleted.
Figure 3.4: Architecture of the simulator.
3.3.2 Simulation Setup
We evaluate the performance of Min-DIFF using four types of datasets. For the
first dataset, we set the resource requirements of VMs to match the standard
general-purpose instance types provided by Amazon EC2. Table 3.2 presents
the seven types of T2 instances we use in our simulations. We set D = 3 and use our
3-dimensional VM placement scheme for this dataset.
For the second and third datasets, we generate VM requests that follow the
uniform distribution and the normal distribution, following Hieu et al. [15]. Table 3.3
shows the resource requirements along each dimension of the VM requests, where U(a, b)
denotes the uniform distribution and N(µ, σ) denotes the normal distribution. We set
D = 4 and use our 4-dimensional VM placement scheme for the second and third datasets.
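A sketch of how such synthetic requests might be generated (the function name and seed are my own; the distribution parameters are those of Table 3.3):

```python
import random

def gen_requests(n, dist, dims=4, seed=42):
    """Generate n synthetic VM requests for the second/third datasets:
    each of `dims` dimensions drawn i.i.d. from U(20, 80) or N(50, 12)."""
    rng = random.Random(seed)
    draw = {"uniform": lambda: rng.uniform(20, 80),
            "normal":  lambda: rng.gauss(50, 12)}[dist]
    return [[draw() for _ in range(dims)] for _ in range(n)]

reqs = gen_requests(5, "uniform")
print(len(reqs), len(reqs[0]))   # 5 requests, 4 dimensions each
```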
The last dataset is the real-world workload trace GWA-T-12 Bitbrains [57], which contains
performance metrics of VMs from a distributed datacenter operated by Bitbrains, a service
provider that hosts applications used in financial fields. We extract the CPU cores and
memory requested by VMs from the traces and set the number of dimensions D = 2 for
VM placement.
For all datasets except the real-world workload trace, we consider two scenarios: a
single request at each time slot and multiple requests at each time slot.
Typically, cloud environments are not homogeneous; they are constructed from
different types of machines [58]. To better resemble a real-world cloud, we generate
heterogeneous PMs based on the machine configurations shown in Reiss et al. [58] for
the first dataset and the real-world workload trace. For the second and third datasets,
we generate five types of PMs, with resource capacities of 200, 250, 300, 350, and 400
along all dimensions, respectively.
VM Instances  CPU cores  Memory (GB)  Bandwidth (Mbit/s)
t2.nano       1          0.5          30
t2.micro      1          1            70
t2.small      1          2            200
t2.medium     2          4            300
t2.large      2          8            500
t2.xlarge     4          16           800
t2.2xlarge    8          32           1024

Table 3.2: Amazon EC2 VM instances used in the first dataset.
Distribution  CPU capacity (GHz)  Memory (GB)  Bandwidth (Gbps)  Storage (GB)
U(a, b)       U(20, 80)           U(20, 80)    U(20, 80)         U(20, 80)
N(µ, σ)       N(50, 12)           N(50, 12)    N(50, 12)         N(50, 12)

Table 3.3: Resource requirements used in the second and third datasets.
We evaluate our algorithm Min-DIFF from two aspects:
1. To show the effectiveness of Min-DIFF in reducing the number of PMs activated
and reducing resource fragmentation, we compare Min-DIFF with related works when
the threshold is set to 100%.
2. To show the effectiveness of the threshold-based idea of Min-DIFF in reducing the
risk of overloading, we compare our algorithm with the related works when the
threshold is smaller than 100%.
We compare Min-DIFF with the following schemes for VM placement:
• First Fit algorithm: A VM is placed in the first PM which has available resources.
• The balanced algorithm EAGLE in [16]: It first uses a multi-dimensional space
partition model to divide a PM into three domains: the acceptance domain (AD), the
safety domain (SD), and the forbidden domain (FD). The PM whose posterior utilization
(utilization after a VM is placed) lies in the AD has the highest priority to be
selected, and the PM whose posterior utilization lies in the SD has the second
priority. If a PM's posterior utilization lies in the FD, a new PM is opened to
place the VM.
• Max-BRU algorithm in [15]: This algorithm uses two multi-dimensional metrics:
the resource utilization along the dth dimension and the resource balance ratio, as
the allocation criteria when it finds the best PM for VM placement.
In our simulations, to better compare the performance of different algorithms, we
make the durations of all VM requests infinitely long, which means that once they are
placed, they will not be deleted.
3.3.3 Simulation Results: Threshold 100%
First, we compare Min-DIFF with First Fit, EAGLE and Max-BRU by setting the threshold
to 100%, which means that we aim at packing VMs as tightly as possible when the
resources of PMs are not over-committed. We consider the following performance metrics:
• The number of utilized PMs: K.
• The average resource fragmentation of all utilized PMs:

    RF_avg = (1/K) Σ_{j=1}^{K} RF_j.    (3.11)
Simulation Results
Figures 3.5 and 3.7 show the number of used PMs in the single-request and multi-request
scenarios, respectively. As we can see from the figures, Min-DIFF uses fewer PMs than
EAGLE, First Fit and Max-BRU, and the gain becomes larger as the number of VMs
increases. Figures 3.6 and 3.8 give comparison results for the average resource
fragmentation in the single-request and multi-request scenarios, respectively. Min-DIFF
achieves the lowest resource fragmentation, which means that Min-DIFF has less resource
wastage and obtains a more balanced resource utilization along different dimensions.
EAGLE and Max-BRU do not perform well in the heterogeneous setting, since they simply
open a new PM when they cannot find available resources among the utilized PMs, whereas
Min-DIFF uses Equation (3.10), which works well for deploying VMs on empty PMs.
Figure 3.5: Number of used PMs for different datasets in the single request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
We also give comparison results for the homogeneous setting in Figure 3.9, where the
resource capacity along each dimension of a PM is set to 150 and the results come from
the uniform distribution dataset with multiple requests in each time slot. It is clear
that Min-DIFF uses resources more efficiently than the other algorithms.
Figure 3.6: The average resource fragmentation of all used PMs for different datasets in the single request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
Figure 3.7: Number of used PMs for different datasets in the multi-request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
Results of the Real-World Workload Trace
We also run experiments using the real-world trace GWA-T-12 Bitbrains. Figure 3.10
shows the comparison results. In this workload trace, some time slots contain multiple
VM requests while others contain only one. In this case too, Min-DIFF obtains the best
performance.
Figure 3.8: The average resource fragmentation of all used PMs for different datasets in the multi-request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
Figure 3.9: Results of the homogeneous setting: (a) Number of used PMs. (b) The average resource fragmentation of all used PMs.
3.3.4 Simulation Results: Threshold is Smaller than 100%
In this section, we compare Min-DIFF with the other approaches when the threshold
is smaller than 100%, meaning that providers do not want excessively high resource
utilization because of the overcommitment issue; the threshold-based idea is then used
to reduce the risk of PM overloading. We set D = 2 and use our two-dimensional
placement algorithm for deploying VMs. The dataset used in this section is Amazon EC2.
In this section, when we talk about utilization, over-committed resources are included.
Figure 3.10: Results of the real-world workload trace: (a) Number of PMs. (b) The average resource fragmentation of all used PMs.
Assuming the warning line is 80% along each resource dimension, we run simulations
in two scenarios:
1. Light load scenario: There are enough PMs for all VM requests.
2. Heavy load scenario: There are not enough PMs for all VMs; if a VM cannot be
placed, a failure happens.
Figure 3.11 shows comparison results for the number of activated PMs and the average
resource fragmentation in the light load scenario. In this scenario, there are enough PMs
for all VM requests, so all VMs are placed under the threshold. Since we place
VMs below the threshold using Strategy 1, Min-DIFF uses more PMs than the other
baselines. Although Min-DIFF uses more PMs, Figure 3.11(b) shows that it achieves the
most balanced use of resources along different dimensions. Besides, the other baselines
do not consider PM overloading, which leaves more of their PMs at risk of overloading
than Min-DIFF's. As shown in Figure 3.12, Min-DIFF has no PM whose utilization of
over-committed resources exceeds 80%, which substantially reduces the risk of PM
overloading. For the other baselines, the number of PMs that reach at least 80% resource
utilization increases with the number of requests. PM overloading can substantially
degrade VM performance; Min-DIFF reduces this risk efficiently.
Figure 3.11: Comparison results of the light load scenario: (a) Number of PMs. (b) The average resource fragmentation of all used PMs.
Figure 3.12: Light load scenario: the percentage of PMs whose resource utilization is higher than 80% (over-committed resources are included): (a) CPU. (b) Memory.
For the heavy load scenario, we increase the number of VM requests and set the
number of PMs to 2500. Figure 3.13 presents the comparison results. With Min-DIFF,
once the number of VM requests exceeds 16000, there are not enough resources below
the threshold for later requests and all PMs are activated; the scheduler then uses
Strategy 2 for deploying VMs, and the number of used PMs no longer changes. As shown
in Figure 3.13(b), Min-DIFF also achieves lower resource fragmentation than the others.
In the heavy load scenario, all PMs run at very high utilization, so we do not discuss
the overloading problem here.
Since the physical resources are insufficient, some VMs cannot be placed and failures
occur. To show the ability to accommodate requests, we compare the number of failures
in Figure 3.14. When the number of VMs is smaller than 15000 there are enough
resources, so we only plot results beyond 15000. In this figure, Min-DIFF has more
failures than First Fit and Max-BRU when the number of VMs is between 17900 and
19000. This is because Min-DIFF first spreads VMs evenly below the threshold: when
all PMs are utilized under Min-DIFF, there are still empty PMs for Max-BRU and First
Fit. When the number of VMs is larger than 19000, Min-DIFF has fewer failures than
the other approaches. At about 23800 VMs, Min-DIFF has around 2400 failures, while
the other approaches have more than 4000. Therefore, Min-DIFF can accommodate more
VM requests than the other approaches.
3.4 Summary
In this chapter, we propose an online algorithm called Min-DIFF which aims at making
a tradeoff between minimizing the number of activated PMs and reducing the risk of
PM overloading in an over-committed cloud. Besides, Min-DIFF also achieves a more
balanced use of resources along multiple resource dimensions, which significantly reduces
resource fragmentation. To better resemble the real-world scenario, we consider VM
Figure 3.13: Comparison results of the heavy load scenario: (a) Number of PMs. (b) The average resource fragmentation of all used PMs.
requests that arrive at different time slots; both homogeneous and heterogeneous settings
of PMs are considered in our simulations. A real-world workload trace and datasets
generated from various distributions are used in our simulations. Simulation results
demonstrate that our proposed algorithm Min-DIFF achieves better performance and
accommodates more requests than the other schemes in related works.
Figure 3.14: Comparison results of the number of failures.
Chapter 4
Deadline-Aware Scheduling and
Routing for Inter-Datacenter
Multicast Transfers
In the inter-datacenter network, bandwidth between different datacenters is expensive
and scarce. How to take full advantage of available inter-datacenter resources to meet
different customer requirements becomes a great challenge. There are many applications
that need to transfer data from one datacenter to multiple datacenters, and typically
these multicast transfers are required to be completed before deadlines. Some existing
works only consider unicast transfers, which is not appropriate for the multicast
transmission type. Another approach in current works is to find a minimum-weight
Steiner tree for each transfer. Instead of using only one tree per transfer, we propose
to use one or multiple trees, which increases routing flexibility, reduces bandwidth
wastage, and enlarges the maximum capacity available to each transfer. In this chapter,
we focus on the multicast transmission type and propose an efficient solution that aims
at maximizing throughput for all transfer requests while taking deadlines into
consideration. We also show that our solution can reduce packet reordering by selecting
only a few Steiner trees for each transfer. We have implemented our solution as a
software-defined overlay network at the application layer, and real-world experiments
on the Google Cloud Platform show that our system effectively improves network
throughput and achieves a lower traffic rejection rate than existing related works.
The remainder of this chapter is organized as follows. In Sec. 4.1, we motivate our
design with an example and discuss our design objectives and choices. In Sec. 4.2, we
present our solution and formulate the routing problem of multiple multicast
inter-datacenter transfers with deadline considerations. In Sec. 4.3, to show its
practicality, we present a real-world implementation of our design. In Sec. 4.4, we
evaluate its validity and performance on the Google Cloud Platform. We discuss future
work and conclude the chapter in Sec. 4.5.
4.1 Motivation Example
Definition of meeting deadlines: In this chapter, we focus on multicast inter-datacenter transfers, which send copies of the same data from a single source to multiple destinations. For each multicast transfer, we say that the transfer meets its deadline when all destinations receive the complete data before a particular time.
Scheduling strategy: Our solution aims to optimally pack transfer requests that arrive within a small time interval, taking full advantage of available inter-datacenter capacities. If the available bandwidth cannot accommodate all requests, we reject the requests with lower priority and repack them when capacity becomes available. We do not use an As-Late-As-Possible policy, because it may reduce the resources available
Requests   Source   Destinations   Volume (MB)   Deadline (seconds)
R1         1        3, 4           200           40
R2         4        2, 3           200           40

Table 4.1: Request requirements for the motivation example.
for future requests and lead to low throughput. Instead, we take full advantage of the available bandwidth to pack requests in the current scheduling time slot.
Motivation Example: Consider the directed network shown in Figure 4.1, where all link capacities are 10 MB/s. There are two transfer requests, R1 and R2; Table 4.1 shows their detailed requirements. If we treat each multicast transfer as multiple unicast transfers, then we can find paths from the source to each of its destinations independently and assign a rate to each path. Figure 4.1(a) illustrates this approach. However, link 1→2 becomes saturated by R1, leaving no bandwidth for R2 to deliver data from node 4 to node 2; R2 will therefore miss its deadline. Missing request deadlines greatly degrades service quality and violates application SLAs, sometimes at significant cost.
A better approach is to use Steiner trees to deliver the source data to all destinations. As shown in Figure 4.1(b), using trees to deliver data saves bandwidth: datacenter 1 sends one copy to datacenter 2, and datacenter 2 then sends two copies to the destinations. Request R1 takes only 5 MB/s of link 1→2, which leaves the remaining 5 MB/s for request R2. Therefore, both R1 and R2 meet their deadlines.
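The arithmetic behind this example can be checked directly: a transfer meets its deadline exactly when its allocated rate delivers the full volume in time. A minimal sketch (the helper name is our own; parameters are taken from Table 4.1 and the 5 MB/s rates from Figure 4.1(b)):

```python
def meets_deadline(volume_mb, rate_mb_s, deadline_s):
    """A transfer meets its deadline when volume / rate <= deadline."""
    return rate_mb_s > 0 and volume_mb / rate_mb_s <= deadline_s

# Steiner-tree allocation of Figure 4.1(b): each request keeps 5 MB/s
# on the shared link, and 200 MB / 5 MB/s = 40 s, exactly the deadline.
print(meets_deadline(200, 5, 40))   # True for R1 (and likewise for R2)

# Unicast allocation of Figure 4.1(a): link 1->2 is saturated by R1,
# so R2 has no rate left toward datacenter 2 and misses its deadline.
print(meets_deadline(200, 0, 40))   # False
```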
In this chapter, we propose to use Steiner trees for multiple multicast transfers. Traditional wisdom applies Steiner tree packing, but it is an NP-hard problem. We instead formulate the problem as a Linear Program (LP) and use a log-based heuristic to find sparse solutions.
Figure 4.1: A motivation example. (a) Finding paths from the source to each destination independently: link 1→2 becomes saturated, leaving no bandwidth for request R2, which misses its deadline. (b) Using Steiner trees for transfers: request R1 takes only 5 MB/s of link 1→2, and both R1 and R2 can complete before their deadlines.
4.2 System Model and Problem Formulation
In an inter-datacenter network, given a number of transfer requests arriving within a
small time interval, the key idea of our design is to determine the sending rate of each
request on each Steiner tree by solving a routing problem. We aim to maximize the
throughput for all requests, subject to deadline constraints. Moreover, we try to use a small number of Steiner trees for each request in order to reduce data-splitting overhead at the source and packet reordering at the destinations. Table 4.2 presents the variables used in this chapter and their definitions.
4.2.1 Finding Feasible Steiner Trees
Network: We model the inter-datacenter network as a directed graph G = (V,E,C).
Link capacity is assumed to be stable within a time period. C(e) denotes the available
link capacity, which is the maximum packet sending rate on edge e ∈ E.
We use Depth-First Search (DFS) to find a set of feasible Steiner trees for each request. Nodes in a tree that act as pure relays are called Steiner nodes. A Steiner tree is a distribution tree that connects the sender to the receivers, possibly through Steiner nodes. DFS starts at the source node and explores as far as possible until it reaches a destination; otherwise, it backtracks along the same path to find other nodes to traverse. It does not terminate until all destinations have been found. The set of feasible Steiner trees is denoted by T^i:

    T^i = {t | t is a Steiner tree (or multicast tree) from S^i to R^i}.
When the number of datacenters and destinations increases, the number of possible Steiner trees found by DFS becomes very large. In order to reduce the complexity of our solution, we add constraints to the search for feasible trees. We classify Steiner trees into two types: trees consisting of a single path that contains all destinations, and all other trees. A single path that includes all destinations saves bandwidth efficiently for multicast transfers, so we keep this kind of path during the DFS. For the other trees, we limit the maximum hop count to 2, which significantly reduces the number of candidate Steiner trees with negligible performance loss.
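As a sketch of the first tree type, the following DFS enumerates simple paths from the source that cover all destinations. This illustrates only the path-shaped trees, not the full search with the hop-limited second type; the function and variable names are our own, not from the thesis implementation:

```python
def paths_covering_all(adj, src, dests):
    """DFS-enumerate simple paths from src that visit every destination.

    adj: dict mapping node -> list of successor nodes (directed graph).
    A path is recorded as soon as it covers all destinations, and is
    not extended further.
    """
    dests = set(dests)
    results = []

    def dfs(node, path, visited):
        if dests <= set(path):          # all destinations on this path
            results.append(list(path))
            return
        for nxt in adj.get(node, []):
            if nxt not in visited:      # keep the path simple
                visited.add(nxt)
                path.append(nxt)
                dfs(nxt, path, visited)
                path.pop()
                visited.remove(nxt)

    dfs(src, [src], {src})
    return results

# Example on a small directed graph: 1 -> 2 -> 3 -> 4, plus 2 -> 4.
adj = {1: [2], 2: [3, 4], 3: [4], 4: []}
print(paths_covering_all(adj, 1, [3, 4]))   # [[1, 2, 3, 4]]
```

The backtracking pattern (append/add before the recursive call, pop/remove after) is what makes the search enumerate all simple paths rather than just one.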
4.2.2 Linear Program Formulation
Request completion time: The completion time of a request is measured from the moment the source starts to send data to the time all of the data has been received by every destination.
Variables        Meaning
T^i              The set of feasible Steiner trees for request i.
S^i              Source datacenter of request i.
R^i              Destination datacenters of request i.
Q^i              Data volume in bytes of request i.
D^i              Deadline requirement of request i.
a^i              Priority of request i.
G = (V, E, C)    The inter-datacenter network graph; V and E are the sets of vertices (datacenters) and edges (links), respectively; for each e ∈ E, C(e) represents the available bandwidth capacity.

Table 4.2: Variables used in the chapter.
It includes propagation delay, queueing delay, and transmission delay. Propagation and queueing delays are on the order of milliseconds; since delay-tolerant transfers are typically large, these delays are negligible. We therefore only consider transmission delay when calculating the transfer completion time.
A transfer request i is specified as a tuple {S^i, R^i, Q^i, D^i, a^i}, where a larger a^i represents a higher priority. Our objective is to maximize the network throughput for all transfers and meet as many transfer deadlines as possible. Some transfers may not have deadlines; we assign a very large deadline value to these transfers. We formulate the problem as the following linear program:
maximize    χ                                                            (4.1)

subject to  χ ≤ ∑_{t∈T^i} x^i(t),                      ∀ i = 1, …, n,    (4.2)

            ∑_{i=1}^{n} ∑_{t∈T^i} x^i(t) φ(t, e) ≤ C(e),    ∀ e ∈ E,     (4.3)

            D^i ∑_{t∈T^i} x^i(t) ≥ Q^i,                ∀ i = 1, …, n,    (4.4)

            x^i(t) ≥ 0,  χ ≥ 0,          ∀ t ∈ T^i,  ∀ i = 1, …, n,      (4.5)

where φ is defined as:

            φ(t, e) = 1 if e ∈ t, and 0 otherwise.
The linear program formulated above can be solved efficiently by a standard LP solver. The objective is to maximize the throughput of all requests, which is the sum of the flow rates over all selected Steiner trees; x^i(t) represents the flow rate on a Steiner tree t. Since flow rates of different requests contend for edge capacities, for each edge e the sum of the flow rates of the trees that use e must not exceed the edge capacity, which is reflected in constraint (4.3). Constraint (4.4) ensures that every transfer completes before its deadline. The flow rates x^i(t) and the throughput objective χ are guaranteed to be non-negative by constraint (4.5).
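To make the constraints concrete, the following sketch checks a candidate rate assignment against (4.2)–(4.5). It is an illustrative feasibility checker, not the thesis solver; all names are our own, and the edge set chosen for R2's tree below is one plausible routing, not taken from the figure:

```python
EPS = 1e-9  # numerical tolerance

def is_feasible(chi, rates, tree_edges, cap, volume, deadline):
    """Check a candidate solution against constraints (4.2)-(4.5).

    rates[i][j]      : flow rate x^i(t) on the j-th tree of request i
    tree_edges[i][j] : set of directed edges used by that tree
    cap[e]           : capacity C(e) of edge e
    volume[i], deadline[i] : Q_i and D_i of request i
    """
    # (4.5) non-negativity
    if chi < -EPS or any(r < -EPS for row in rates for r in row):
        return False
    # (4.2) chi is a lower bound on every request's total rate, and
    # (4.4) D_i * total rate >= Q_i, i.e. the transfer finishes in time
    for i, row in enumerate(rates):
        total = sum(row)
        if chi > total + EPS or deadline[i] * total < volume[i] - EPS:
            return False
    # (4.3) summed rates on each edge respect its capacity
    load = {}
    for row, trees in zip(rates, tree_edges):
        for r, edges in zip(row, trees):
            for e in edges:
                load[e] = load.get(e, 0.0) + r
    return all(load[e] <= cap[e] + EPS for e in load)

# Motivation example (Table 4.1): each request gets one tree at 5 MB/s,
# sharing edge (1, 2) whose capacity is 10 MB/s.
ok = is_feasible(
    chi=5,
    rates=[[5], [5]],
    tree_edges=[[{(1, 2), (2, 3), (2, 4)}], [{(4, 1), (1, 2), (2, 3)}]],
    cap={(1, 2): 10, (2, 3): 10, (2, 4): 10, (4, 1): 10},
    volume=[200, 200],
    deadline=[40, 40],
)
print(ok)   # True
```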
Post-Processing the Linear Program: It is possible that meeting all transfer deadlines would exceed the link capacities, in which case the linear program has no feasible solution. When this happens, our approach is to
reject the transfer with the lowest priority. If multiple requests share the same priority, we reject the request that demands the largest bandwidth.
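This rejection rule can be sketched as a single key function (field names are our own, hypothetical choices): lowest priority loses first, and among ties the request with the largest minimum required rate Q_i / D_i is dropped:

```python
def pick_request_to_reject(requests):
    """Pick the request to drop when the LP is infeasible.

    requests: list of dicts with 'priority' (larger = more important),
    'volume' (Q_i, in MB) and 'deadline' (D_i, in seconds).
    Ties on priority are broken by the largest demand Q_i / D_i.
    """
    return min(requests,
               key=lambda r: (r['priority'], -r['volume'] / r['deadline']))

reqs = [
    {'name': 'A', 'priority': 2, 'volume': 300, 'deadline': 10},
    {'name': 'B', 'priority': 1, 'volume': 100, 'deadline': 10},
    {'name': 'C', 'priority': 1, 'volume': 400, 'deadline': 10},
]
print(pick_request_to_reject(reqs)['name'])   # 'C': lowest priority, largest demand
```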
4.2.3 Choosing Sparse Solutions
The linear program formulated above has a collection of feasible solutions. Since we use multiple Steiner trees to deliver source data to the destinations of each request, it is inevitable to split data at the source, which adds splitting overhead. Using multiple trees also adds packet-reordering overhead at the destinations. To reduce such overhead, we prefer to use few trees for distributing data, which requires choosing sparse solutions from among the feasible ones. We therefore add a penalty function to the objective:
maximize  χ − μ ∑_{i=1}^{n} ∑_{t∈T^i} g(x^i(t)),                        (4.6)

subject to the same constraints (4.2)−(4.5), where g(x^i(t)) is defined as:

          g(x^i(t)) = 0 if x^i(t) = 0, and 1 if x^i(t) > 0.
Problem (4.6) differs from Problem (4.1) in its objective function. To obtain optimal throughput while using fewer trees, μ should be neither too large nor too small: too large a μ could push the solution far from optimality, while too small a μ could lead to many trees being selected. In our experiment settings, we let μ = 0.01, and Problem (4.6) returns almost the same throughput value as Problem (4.1); the error is smaller than 10⁻⁸, which is negligible. We will show this in the experiment results.
Problem (4.6) is a non-convex optimization problem. A log-based heuristic is widely used for finding sparse solutions; the basic idea is to replace g(x^i(t)) by log(|x^i(t)| + δ), where δ is a small positive threshold that determines what counts as close to zero. Since the problem is still not convex, we linearize the penalty function, inspired by [29], using a weighted l1-norm heuristic:

maximize  χ − μ ∑_{i=1}^{n} ∑_{t∈T^i} W^i(t) · x^i(t),                  (4.7)

subject to the same constraints (4.2)−(4.5). In each iteration, we recalculate the weight function W^i, where:

          W^i(t) = 1 / ((x^i(t))^k + δ).
Problem (4.7) is then a linear program, and it is solved iteratively. Here (x^i(t))^k is the solution obtained in the k-th iteration, and δ is a small positive constant. Observe that the smaller (x^i(t))^k is, the larger the weight W^i(t) becomes, pushing x^i(t) toward zero. Upon convergence, (x^i(t))^k ≈ (x^i(t))^{k+1} = (x^i(t))^*, for i = 1, …, n, t ∈ T^i, and then:

          W^i(t) · (x^i(t))^* = (x^i(t))^* / ((x^i(t))^k + δ) = 0 if (x^i(t))^* = 0, and ≈ 1 if (x^i(t))^* > 0.

Eventually, the transformed Problem (4.7) approaches Problem (4.6) and yields sparse solutions. Algorithm 4.1 presents a summary of our solution.
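The fixed-point behavior of the reweighting step can be seen in isolation: at convergence, the weighted term W^i(t)·x^i(t) recovers the 0/1 penalty g. A minimal sketch, with δ as in Algorithm 4.1 and helper names of our own:

```python
DELTA = 1e-8   # the small positive constant delta from Algorithm 4.1

def update_weights(x_prev):
    """Reweighting step of the l1-norm heuristic: W = 1 / (x^k + delta)."""
    return [1.0 / (x + DELTA) for x in x_prev]

# At a fixed point x^{k+1} = x^k = x*, the weighted term W * x* recovers
# the 0/1 penalty g(x): zero rates cost 0, nonzero rates cost about 1.
x_star = [0.0, 2.5, 0.01]
w = update_weights(x_star)
penalties = [wi * xi for wi, xi in zip(w, x_star)]
print(penalties)   # approximately [0.0, 1.0, 1.0]
```

This is why the iteration drives small rates to exactly zero while leaving large rates essentially unpenalized, yielding sparse tree selections.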
4.2.4 Proof of Convergence
In this section, we provide a brief proof of convergence for the iterative Problem (4.7), restated as follows:
Algorithm 4.1 Deadline-aware routing for multiple multicast transfers
1: Input: transfer requests {S^i, R^i, Q^i, D^i, a^i}; network topology G = (V, E, C).
2: k := 0. Initialize δ = 10⁻⁸, (W^i(t))^0 = 1, sparse_flag = False.
3: Update k = k + 1.
4: If sparse_flag == False: given the solution (x^i(t))^k from the previous iteration, set W^i(t) = 1 / ((x^i(t))^k + δ), then solve the linear program (4.7) to obtain the flow rates (x^i(t))^{k+1}, the optimal throughput χ^{k+1}, and the solver status.
5: If status == infeasible: remove the request with the lowest priority, then solve the linear program (4.7) with the updated inputs to obtain (x^i(t))^{k+1}, χ^{k+1}, and status.
6: If status == optimal: if (x^i(t))^{k+1} ≈ (x^i(t))^k, return (x^i(t))^* ≈ (x^i(t))^{k+1}; else go to Step 3 for another iteration. If status == infeasible, go to Step 5.
7: Output: flow rates {x^i(t)} and the corresponding Steiner trees {t | t ∈ T^i}.
Proposition 4.1.

maximize  χ − μ ∑_{i=1}^{n} ∑_{t∈T^i} x^i(t) / ((x^i(t))^k + δ)          (4.8)

subject to  x = (x^1(t), …, x^n(t)) ∈ C,  ∀ t ∈ T^i,                      (4.9)

with δ > 0 and x^i(t) ≥ 0 for i = 1, …, n, where C ⊂ ℝ^n is a convex, compact set. When k → ∞, we have (x^i(t))^{k+1} − (x^i(t))^k → 0, for all i, t ∈ T^i.

Proof. Let N_i denote the number of Steiner trees for request i. Since Problem (4.8) yields (x^i(t))^{k+1}, and its penalty term minimizes ∑_{i=1}^{n} ∑_{t∈T^i} x^i(t) / ((x^i(t))^k + δ), we have:

∑_{i=1}^{n} ∑_{t∈T^i} ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ)
    ≤ ∑_{i=1}^{n} ∑_{t∈T^i} ((x^i(t))^k + δ) / ((x^i(t))^k + δ) = ∑_{i=1}^{n} N_i.    (4.10)
Using the inequality between the arithmetic and geometric means, we have:

(1 / ∑_{i=1}^{n} N_i) ∑_{i=1}^{n} ∑_{t∈T^i} ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ)
    ≥ [ ∏_{i=1}^{n} ∏_{t∈T^i} ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ) ]^{1 / ∑_{i=1}^{n} N_i}.    (4.11)

Combining (4.10) and (4.11), we get:

[ ∏_{i=1}^{n} ∏_{t∈T^i} ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ) ]^{1 / ∑_{i=1}^{n} N_i} ≤ 1.    (4.12)

We let

A((x^i(t))^k) = ((x^i(t))^k + δ)^{1 / ∑_{i=1}^{n} N_i}.    (4.13)

Since (x^i(t))^k ≥ 0 and δ > 0, A((x^i(t))^k) is bounded below by δ^{1 / ∑_{i=1}^{n} N_i}, so A((x^i(t))^k) converges to a nonzero limit as k → ∞, which implies that

lim_{k→∞} ∏_{i=1}^{n} ∏_{t∈T^i} [ ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ) ]^{1 / ∑_{i=1}^{n} N_i} = 1.    (4.14)

Now, combining (4.14) with Equations (4.10) and (4.11), as k → ∞ we have:

∑_{i=1}^{n} N_i ≤ ∑_{i=1}^{n} ∑_{t∈T^i} ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ) ≤ ∑_{i=1}^{n} N_i,    (4.15)

which is equivalent to:

lim_{k→∞} ∑_{i=1}^{n} ∑_{t∈T^i} ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ) = ∑_{i=1}^{n} N_i.    (4.16)

Therefore, ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ) → 1 as k → ∞, which means that (x^i(t))^{k+1} ≈ (x^i(t))^k. Convergence proved. □
4.2.5 An Example of the Optimal Solution
Figure 4.2: An example of the optimal solution obtained by solving the linear program in Sec. 4.2.3 for maximizing the total throughput of all requests. Each request is assigned one or more Steiner trees, with a flow rate allocated on each tree.
An example using the inter-datacenter network is shown in Figure 4.2. To simplify the example, we assume all link capacities are 15 MB/s. Consider two requests, R1 and R2: R1 needs to send source data from datacenter 2 to datacenters 1 and 4, while R2 needs to send source data from datacenter 5 to datacenters 1 and 3. Table 4.3 gives the detailed requirements of the two requests. We use this example to explain the benefit of our linear programming formulation, which tries to maximize throughput while meeting the deadlines of all requests; Figure 4.2 shows the optimal solution obtained by solving the linear program in Sec. 4.2.3. Our solution splits the source data at the sender according to the flow rate allocated on each tree and sends the data through the different trees. We can see that both requests meet their deadlines, and R2 can even finish the
Requests   Source   Destinations   Volume (MB)   Deadline (seconds)
R1         2        1, 4           300           8
R2         5        1, 3           300           18

Table 4.3: Request requirements for the example.
transfer before its deadline since the linear program aims at maximizing throughput.
If we treat each multicast transfer as multiple unicast transfers, R1 will miss its deadline, and this approach wastes a great deal of bandwidth. DDCCast [27] finds only one minimum-weight Steiner tree for each request. In our example, the largest capacity of a single tree is only 15 MB/s; if we use only one tree to distribute the data, the shortest possible transfer time is 20 s, so both R1 and R2 would still miss their deadlines. Using multiple trees increases the throughput of a transfer, which allows more transfers to meet their deadlines.
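A simple lower bound makes the point. This back-of-the-envelope helper is our own (it ignores shared links): a request needs at least ⌈(Q/D)/c⌉ trees when each tree is capped at the per-link capacity c.

```python
import math

def min_trees_needed(volume_mb, deadline_s, tree_cap_mb_s):
    """Lower bound on the number of trees: the required rate Q/D is
    split over trees that each carry at most the bottleneck capacity."""
    required_rate = volume_mb / deadline_s
    return math.ceil(required_rate / tree_cap_mb_s)

# Table 4.3 with 15 MB/s links: one tree (DDCCast) is never enough for R1.
print(min_trees_needed(300, 8, 15))    # 3: R1 needs 300/8 = 37.5 MB/s
print(min_trees_needed(300, 18, 15))   # 2: R2 needs about 16.7 MB/s
print(min_trees_needed(300, 20, 15))   # 1: a 20 s deadline fits one tree
```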
4.3 Implementation
We have completed a real-world implementation in a software-defined overlay network testbed at the application layer. Unlike traditional SDN techniques, our application-layer SDN does not need to cope with complicated lower-layer properties and management. Besides, our application-layer solution has higher switching capacity, supporting more forwarding rules at each datapath node and scaling well to a large number of transfer requests.
Figure 4.3 shows the high-level architecture of our application-layer solution. After the testbed starts, the controller and datapath nodes establish persistent TCP connections with each other; we use iperf to measure the bandwidth between each pair of nodes and send it to the controller, an important input for making routing decisions.
Figure 4.3: Architecture of the application-layer SDN design: a central controller makes routing decisions, and each datapath node runs a local aggregator that forwards data.
We employ a local aggregator at each datapath node; this aggregator helps to aggregate and schedule inter-datacenter flows. In our experiment, we use six Virtual Machine (VM) instances located in six different datacenters, and one of the VMs is also launched as the central controller.
Now we explain how an inter-datacenter transfer is routed and completed through the application-layer SDN testbed. After a transfer request is submitted, the relevant destination nodes first subscribe to a specific channel using a subscriber API implemented in Java; the source node then publishes its data, destinations, deadline requirement, and priority information to the channel using a publisher API. Source data are aggregated at the local aggregator, and the aggregator consults the controller for routing rules. In the controller, our routing algorithm, implemented in Python, computes routing rules from the bandwidth measurements and the request's information. Two types of routing rules are published to each datapath node: one is {'NodeId': xx, 'NextHop': xx, 'SessionId': xx}, which indicates the next-hop datacenter for the current datapath node; the other is {'NodeId': xx, 'Weight': xx, 'SessionId': xx}, where the value of
'Weight' indicates the sending rate of the datapath node. After the aggregator receives the routing rules, if multiple trees are needed for sending the data, the source data are split at the source node. When data arrive at the aggregator of another node, the aggregator checks the rule: if 'NextHop' is the node itself, the data have been delivered successfully and are written back to disk; if 'NextHop' names a different node, the data are relayed by the aggregator to that node.
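The aggregator's forwarding decision can be sketched as follows. The rule fields match those above; the write/relay callbacks are hypothetical stand-ins for the testbed's disk and socket code:

```python
def handle_block(rule, node_id, block, write_to_disk, relay):
    """Dispatch one data block according to a routing rule.

    rule: {'NodeId': ..., 'NextHop': ..., 'SessionId': ...}
    If this node is the rule's next hop, the block has arrived and is
    persisted; otherwise it is relayed toward the next-hop datacenter.
    """
    if rule['NextHop'] == node_id:
        write_to_disk(block)
        return 'delivered'
    relay(block, rule['NextHop'])
    return 'relayed'

# Minimal usage with recording callbacks:
delivered, relayed = [], []
rule = {'NodeId': 2, 'NextHop': 3, 'SessionId': 7}
print(handle_block(rule, 3, b'data', delivered.append,
                   lambda b, hop: relayed.append((b, hop))))   # 'delivered'
print(handle_block(rule, 2, b'data', delivered.append,
                   lambda b, hop: relayed.append((b, hop))))   # 'relayed'
```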
In our experiment, we generate transfer requests within a small time interval and try to send all of them before their deadlines using our routing algorithm. When a request is rejected, the controller makes a new routing decision once capacity becomes available and sends the decision to all datapath nodes.
4.4 Performance Evaluation
Now we are ready to evaluate the performance of our real-world implementation. In this
section, we present our experiment settings and evaluation results.
4.4.1 Experiment Setup
We have deployed our real-world implementation, with the linear program routing algorithm, on the Google Cloud Platform across six geographically distributed datacenters. In each datacenter, we launch one Virtual Machine (VM). The locations of these datacenters are shown in Figure 4.4.
In our deployment, we use the VM instances in all datacenters as datapath nodes, and the VM instance in Iowa (us-central1-a) serves as the controller of our application-layer testbed. All VM instances are of type n1-standard-4, each with
Figure 4.4: The 6 Google Cloud datacenters used in our deployment and experiments: US West (Oregon), US Central (Iowa), US East (North Virginia), Europe West (London), Asia East (Taiwan), and Asia Northeast.
4 vCPUs, 15 GB of memory, and a 10 GB solid-state drive, running Ubuntu 14.04 LTS. In our experiment, we aim to show the benefit of using multiple Steiner trees for transfers, so we use Linux Traffic Control (TC) to give each inter-datacenter link a uniform 120 Mbps of bandwidth.
We use the Linux command truncate to generate an input file of a fixed size for each request. In our experiment, when a request is submitted, the destinations of the request first invoke the Java subscriber() API to subscribe to the request. After that, the VM instance holding the source data invokes the Java publisher() API to read the file in 4 MB blocks and publish them to the aggregator.
4.4.2 Evaluation Methodology
Workload: We use file replication as the inter-datacenter traffic. For each transfer, we pick the source randomly from the six datacenters and increase the number of destinations from 1 to 5. The volume of each file is set to 300 MB. For deadline-constrained transfers, we choose deadlines from a uniform distribution over [T, αT], as in OWAN [53], where
α represents the tightness of the deadlines: when α is small, transfers have very close deadlines. T is the shortest deadline of all requests, which is related to the volume of the transferred data and the number of transfers. The priority value is generated randomly for each transfer. We run our experiments over multiple time slots; in each time slot, six transfer requests are generated within a small interval at the beginning of the slot. The length of each time slot is then the longest deadline among the requests.
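The workload generation described above can be sketched as follows (parameter names and the priority range are our own illustrative choices):

```python
import random

def generate_requests(n, base_deadline, alpha, volume_mb=300, seed=42):
    """Generate n transfer requests for one time slot.

    Deadlines are drawn uniformly from [T, alpha*T], as in OWAN, and
    each request receives a random integer priority.
    """
    rng = random.Random(seed)          # seeded for reproducibility
    return [{'volume': volume_mb,
             'deadline': rng.uniform(base_deadline, alpha * base_deadline),
             'priority': rng.randint(1, 10)}
            for _ in range(n)]

reqs = generate_requests(6, base_deadline=10, alpha=2)
print(len(reqs))                                      # 6
print(all(10 <= r['deadline'] <= 20 for r in reqs))   # True
```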
Performance metrics: We measure two metrics: the inter-datacenter throughput and the percentage of requests that meet their deadlines. The inter-datacenter throughput is the total size of all transferred files divided by the total transfer time needed to finish the requests, in Mbps.
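In code, this metric is a one-liner; note the unit conversion from megabytes to megabits (a factor of 8). The function name is our own:

```python
def avg_throughput_mbps(file_sizes_mb, total_time_s):
    """Inter-datacenter throughput: total bits transferred / total time.

    file_sizes_mb are in megabytes, so multiply by 8 to get megabits.
    """
    return sum(file_sizes_mb) * 8.0 / total_time_s

# Two 300 MB replications completing within 40 s in total:
print(avg_throughput_mbps([300, 300], 40))   # 120.0 (Mbps)
```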
We compare our solution with two other solutions, DDCCast [27] and Amoeba [25]. DDCCast finds only one tree for each request and schedules requests as late as possible: to maximize utilization in the current time slot, it pulls some traffic into the current slot and pushes other traffic forward, close to the deadlines. Since DDCCast uses only one tree per request, it cannot accommodate some requests with early deadlines. Amoeba considers unicast transfers; it finds the k shortest paths for each source-destination pair.
4.4.3 Evaluation Results
Sparse solution performance: We compare the sparse solution with the original linear program (without the penalty) in Table 4.4. From the table, we can see that the sparse routing approach attains the same optimal value as the original linear program while using far fewer trees.
Completion time deviation: We run the experiment over 10 time slots and record the completion time of each transfer request. The completion time is the time from
Requests     1      2      3      4      5      6     7      8      9      10     Optimal Value
Workload 1   2/15   5/15   2/18   2/15   2/15   3/6   1/18   3/15   1/15   5/15   5.558/5.558
Workload 2   3/18   2/18   2/18   2/11   1/11   2/18  1/11   3/18   1/11   2/18   7.901/7.901

Table 4.4: Comparison of the sparse routing approach and the original linear program. In each cell, the left number is the number of trees used by the sparse solution and the right number is the number of trees used by the original linear program; the last column compares their optimal throughput values.
Figure 4.5: Completion time deviation: CDF of (actual completion time − scheduled completion time), in seconds.
the moment the destinations subscribe to the request until all destinations have received the source data. To show that our solution schedules deadline-constrained requests effectively, we plot in Figure 4.5 the CDF of the difference between the actual and scheduled completion times. We observe that 80% of the requests finished before their scheduled time. A possible reason is that we use TC to cap the bandwidth at 120 Mbps and use the same value in our routing decisions, yet flows sometimes cannot reach the full 120 Mbps. We therefore set the link bandwidth slightly higher, at 130 Mbps, in later experiments.
Early deadline requests and the deadline tightness factor: Some requests may have early deadlines, and our solution performs better for these requests. When
Figure 4.6: Comparison of different solutions for early deadline requests (x-axis: tightness factor α; y-axis: percentage of requests that meet deadlines; schemes: our solution, Amoeba, DDCCast).
a request requires more bandwidth than any single link capacity in the network, one routing tree is not enough for the request to meet its deadline. We generate deadlines from a uniform distribution over [T, αT]. To show the benefit of our solution for requests with early deadlines, we set T = 10 s and increase α from 1.2 to 4 to observe the effect of the tightness factor.
Figure 4.6 presents the comparison of the different solutions when the number of destinations is 2. The x-axis is the tightness factor α; the y-axis is the percentage of requests that meet their deadlines. When the deadline ranges from 10 s to 20 s, DDCCast cannot accommodate such requests, because the largest capacity of a single tree is 120 Mbps. As α increases, more requests meet their deadlines because the range of deadlines becomes larger. Amoeba achieves a lower percentage of deadline-meeting requests because its unicast transfers use more bandwidth per transfer than our solution. DDCCast performs worse than Amoeba because it cannot accommodate transfers with deadlines earlier than 20 s. The comparison shows that our solution admits
Figure 4.7: Comparison of different solutions as the number of destinations increases (x-axis: number of destinations, 1 to 5; y-axis: percentage of requests that meet deadlines).
more early deadline requests than DDCCast and Amoeba.
Effect of the number of destinations: We increase the number of destinations from 1 to 5, with α = 2 and T = 20 s. Figure 4.7 shows the percentage of requests that meet their deadlines as the number of destinations increases; our solution admits more transfers than the other two solutions. As the number of destinations grows, Amoeba does not have enough bandwidth to allocate to all source-destination pairs. DDCCast finds one minimum-weight tree for each transfer, and as the number of destinations increases, some transfers may have no room to be scheduled.
Throughput: To demonstrate the throughput improvement of our solution, we plot the throughput performance in Figure 4.8. The average throughput is calculated as the total file size of all requests that meet their deadlines divided by the total transfer time; we only count requests that meet their deadlines. Our solution achieves the highest utilization of network bandwidth and admits more transfers than the other two solutions, so its throughput is also the highest. We can see that, when the number
Figure 4.8: Throughput comparison of different solutions (x-axis: number of destinations; y-axis: average throughput in Mbps).
of destinations is 1 or 2, Amoeba has higher throughput than DDCCast, possibly because DDCCast always tries to push some transfers close to their deadlines, which can make the transfer time longer than under Amoeba.
Scalability: To show the scalability of our linear program with sparse solutions, we record the running time of the LP for different numbers of input variables, shown in Figure 4.9. The running time is averaged over multiple runs. With 900 input variables, the running time is under 1.75 s, which is acceptable compared with the transfer times of the requests. The result shows that our solution is efficient and converges quickly. Moreover, the number of datacenters in practice is usually small; thus our solution is scalable.
In a nutshell, the evaluation results show that our solution maximizes network throughput and admits more transfers than DDCCast and Amoeba. Compared with DDCCast, our solution can admit requests with early deadlines, which demand more bandwidth than any single link capacity.
Figure 4.9: The computation time of our approach (x-axis: number of variables, 36 to 900; y-axis: running time in seconds).
4.5 Summary
4.5.1 Discussion
We now discuss some directions for future work.
Dynamic resources: Like previous related works, our work assumes that network resources are stable. However, network resources can change dynamically over time. In future work, we may consider dynamic resources when making routing decisions: the controller would measure bandwidth at each time slot and repack the remaining requests under the current network resources.
Different request arrival rates: Our work does not explore the effect of the request arrival rate. Since our objective is to maximize throughput and accommodate a maximal number of transfers with deadline requirements, we assume requests arrive within a small time interval at the beginning of each time slot. The results show that our solution performs well in routing requests that arrive close together. In future work, we may add
a time dimension to our formulation and explore the effect of different request arrival rates.
4.5.2 Conclusion
In this chapter, we design an efficient solution for multicast inter-datacenter transfers that aims to maximize network throughput and meet as many transfer deadlines as possible. Traditionally, treating a multicast transfer as multiple independent unicast transfers wastes bandwidth and causes other requests to miss their deadlines. We therefore propose to use multiple Steiner trees for each multicast transfer. We formulate the problem as a linear program (LP) and find sparse solutions using a weighted l1-norm heuristic. To demonstrate the practicality and efficiency of our solution, we have implemented it in a software-defined overlay network testbed at the application layer, using the Google Cloud Platform for real-world experiments with six Virtual Machine instances in six different datacenters. Experimental results show that our design outperforms related existing works in maximizing throughput and meeting transfer deadlines.
Chapter 5
Conclusion
Our focus in this thesis is to study the problem of resource optimization across geo-
graphically distributed datacenters. For cloud providers, it is essential to meet customer
requirements in time, guarantee quality of service, and reduce resource wastage. Since
different users have various resource requirements, resource optimization algorithms used
by cloud providers have a significant impact on the performance of virtual machines that
users rent for computation as well as on the ability of datacenters to accommodate user
requests. In this thesis, we propose different approaches for resource optimization in
various aspects.
Cloud computing provides a large pool of resources for users to store and process
their data. Users require the allocation of virtual machines (VMs) in datacenters to meet
their computational needs. We first study the multi-dimensional VM placement problem.
Because users often utilize fewer resources than their reserved capacities, resource overcommitment is incorporated in most cloud products to reduce resource wastage. Existing works only consider maximizing the resource utilization of PMs and minimizing the number of PMs used to save energy, which risks overloading PMs and degrading VM
performance. To solve this problem, we propose a threshold-based algorithm Min-DIFF
which can achieve a balanced use of resources along different dimensions and reduce the
risk of PM overloading. Extensive simulation results have shown that our algorithm
achieves better performance and accommodate more requests than related works.
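The core placement step can be illustrated with a minimal sketch in the spirit of a threshold-based, imbalance-minimizing algorithm. The names, the fixed threshold, and the max-minus-min imbalance score below are illustrative assumptions, not the thesis's exact Min-DIFF rule:

```python
# Sketch of threshold-based, balance-aware VM placement (illustrative only).
THRESHOLD = 0.9  # assumed per-dimension utilization cap to limit overload risk


def place_vm(vm_demand, pms):
    """Pick a PM for a VM request.

    vm_demand: normalized demand per dimension (e.g. [cpu, memory]).
    pms: current normalized utilization vectors, one per PM.
    Returns the index of the chosen PM, or None if no PM qualifies.
    """
    best, best_diff = None, float("inf")
    for i, used in enumerate(pms):
        after = [u + d for u, d in zip(used, vm_demand)]
        if any(a > THRESHOLD for a in after):
            continue  # placement would exceed the safety threshold
        diff = max(after) - min(after)  # imbalance across dimensions
        if diff < best_diff:
            best, best_diff = i, diff
    return best


pms = [[0.5, 0.2], [0.3, 0.35]]
print(place_vm([0.1, 0.1], pms))  # -> 1: PM 1 yields the more balanced load
```

Choosing the PM that minimizes the post-placement spread across dimensions keeps no single resource (CPU, memory, ...) saturated while others sit idle, which is the balancing behavior the paragraph above describes.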
As the volume of data grows, storing all of it within a single datacenter is no
longer feasible, and the data naturally need to be distributed across multiple datacenters.
This is further motivated by the fact that the data to be processed, such as user activity
logs, are generated in a geographically distributed fashion. It is therefore more efficient to
store the data where they are generated, which leads to deploying cloud computing
resources over many datacenters in a wide area network. Because of this geographic
distribution, many applications need to process data across different datacenters.
Bandwidth between datacenters is costly and scarce, and when multiple transfers share
the same inter-datacenter link, allocating resources to these transfers while meeting
their requirements is challenging.
Considering that most inter-datacenter transfers need to be completed before their
deadlines, meeting as many deadlines as possible is an important objective. We propose to
use multiple Steiner trees for each inter-datacenter multicast transfer, and formulate the
problem as a linear program whose objective is to maximize throughput over all transfer
requests while meeting their deadlines. Through experiments on the Google Cloud
Platform, we have shown that our deadline-aware solution achieves higher network
throughput and a lower rejection rate than related works.
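A throughput-maximizing, deadline-aware tree-rate allocation of this kind can be sketched, in generic form with illustrative symbols (not necessarily the thesis's exact formulation), as:

```latex
\begin{aligned}
\text{maximize} \quad   & \sum_{i} \sum_{j \in T_i} x_{ij} \\
\text{subject to} \quad & \sum_{i} \sum_{j \in T_i : \, e \in j} x_{ij} \le c_e
                        && \forall \text{ links } e, \\
                        & \sum_{j \in T_i} x_{ij} \ge \frac{V_i}{d_i}
                        && \forall \text{ transfers } i, \\
                        & x_{ij} \ge 0,
\end{aligned}
```

where \(T_i\) is the set of candidate Steiner trees for transfer \(i\), \(x_{ij}\) the rate assigned to tree \(j\), \(c_e\) the capacity of inter-datacenter link \(e\), and \(V_i/d_i\) the minimum aggregate rate needed to move volume \(V_i\) before deadline \(d_i\).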