RESOURCE OPTIMIZATION ACROSS
GEOGRAPHICALLY DISTRIBUTED
DATACENTERS
by
Siqi Ji
A thesis submitted in conformity with the requirements
for the degree of Master of Applied Science
Edward S. Rogers Sr. Dept. of Electrical and Computer Engineering
University of Toronto
© Copyright 2017 by Siqi Ji
Resource Optimization Across Geographically
Distributed Datacenters
Master of Applied Science Thesis
Edward S. Rogers Sr. Dept. of Electrical and Computer Engineering
University of Toronto
by Siqi Ji
2017
Abstract
Cloud computing provides users and enterprises with shared pools of resources to store and process their data. To provide quality-of-service guarantees for customers and reduce resource wastage, resource management becomes a crucial problem for cloud providers. In this thesis, we propose and implement different approaches for resource optimization with various objectives. The resource optimization algorithms used by cloud providers have a significant impact on the performance of the virtual machines (VMs) that users rent for computation, as well as on the ability of datacenters to accommodate user requests. We propose a multi-dimensional online VM placement algorithm that can balance the usage of resources along multiple dimensions and improve VM performance effectively. There are also applications, such as geo-replication, that need to transfer data across datacenters within a given time period. We propose and implement an efficient solution that maximizes throughput for multiple concurrent inter-datacenter multicast transfers while meeting their deadlines.
TO MY PARENTS
Acknowledgments
First, and most importantly, I would like to express my deepest appreciation to my thesis supervisor, Professor Baochun Li, for his continuous support of my master's study and research at the University of Toronto. I benefited a lot from his immense knowledge, sharp vision, and scientific insights. He always gives me valuable advice, not only for my research but also for my career.
I also would like to thank my examination committee: Professor Ben Liang, Professor Shahrokh Valaee, and Professor Cristiana Amza, for their insightful comments and advice.
Third, I would like to thank all the members of the iQua research group: Xu Yuan, Jun Li, Zhiming Hu, Liyao Xiang, Li Chen, Shuhao Liu, Wenxin Li, Yinan Liu, Hao Wang, and Wanyu Lin. They are like my family and are always there to help me out. I learned a lot from them, and they made my master's life at U of T more fun and more fulfilling.
Last but not least, I want to thank my family — my father Youhui Ji, my mother Xuexiang Wang, and my fiancé Puwen Chen. I could not have done this without their support, understanding, and love. They never give up on me and always encourage me to do my best. There are not enough words to express my love for them.
Contents
Abstract
Acknowledgments
Contents
List of Tables
List of Figures
1 Introduction
1.1 Virtual Machine Placement
1.2 Inter-Datacenter Multicast Transfers with Deadlines
1.3 Thesis Organization
2 Related Work
2.1 Virtual Machine Placement
2.2 Inter-Datacenter Multicast Transfers with Deadlines
3 An Online Virtual Machine Placement Algorithm in an Over-Committed Cloud
3.1 Motivation Example
3.2 Min-DIFF: An Online VM Placement Algorithm
3.2.1 Resource Threshold
3.2.2 Find the Best PM for Single VM Request
3.2.3 VM Selection for Multiple VM Requests
3.2.4 Details of Min-DIFF
3.3 Performance Evaluation
3.3.1 Architecture of the Simulator
3.3.2 Simulation Setup
3.3.3 Simulation Results: Threshold 100%
3.3.4 Simulation Results: Threshold is Smaller than 100%
3.4 Summary
4 Deadline-Aware Scheduling and Routing for Inter-Datacenter Multicast Transfers
4.1 Motivation Example
4.2 System Model and Problem Formulation
4.2.1 Finding Feasible Steiner Trees
4.2.2 Linear Program Formulation
4.2.3 Choose Sparse Solutions
4.2.4 Proof of Convergence
4.2.5 An Example of the Optimal Solution
4.3 Implementation
4.4 Performance Evaluation
4.4.1 Experiment Setup
4.4.2 Evaluation Methodology
4.4.3 Evaluation Results
4.5 Summary
4.5.1 Discussion
4.5.2 Conclusion
5 Conclusion
Bibliography
List of Tables
3.1 Variables used in this chapter.
3.2 Amazon EC2 VM instances used in the first dataset.
3.3 Resource requirements used in the second and third datasets.
4.1 Request requirements for the motivation example.
4.2 Variables used in the chapter.
4.3 Request requirements for the example.
4.4 Comparison of the sparse routing approach and the original linear program; the left side shows the number of trees for the sparse solution, and the right side the number of trees for the original linear program.
List of Figures
3.1 A motivation example of VM placement.
3.2 A sketch of the threshold-based idea.
3.3 An illustrative example of resource fragmentation.
3.4 Architecture of the simulator.
3.5 Number of used PMs for different datasets in the single request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
3.6 The average resource fragmentation of all used PMs for different datasets in the single request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
3.7 Number of used PMs for different datasets in the multi-request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
3.8 The average resource fragmentation of all used PMs for different datasets in the multi-request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
3.9 Results of the homogeneous setting: (a) Number of used PMs. (b) The average resource fragmentation of all used PMs.
3.10 Results of the real-world workload trace: (a) Number of PMs. (b) The average resource fragmentation of all used PMs.
3.11 Comparison results of the light load scenario: (a) Number of PMs. (b) The average resource fragmentation of all used PMs.
3.12 Light load scenario: the percentage of PMs whose resource utilization is higher than 80% (over-committed resources are included): (a) CPU. (b) Memory.
3.13 Comparison results of the heavy load scenario: (a) Number of PMs. (b) The average resource fragmentation of all used PMs.
3.14 Comparison results of the number of failures.
4.1 A motivation example: (a) Finding paths from the source to each destination; request R2 will miss its deadline. (b) Using Steiner trees for transfers; both R1 and R2 can complete before their deadlines.
4.2 An example of the optimal solution obtained by solving the linear program in Sec. 4.2.3 for maximizing the total throughput of all requests.
4.3 Architecture of the application-layer SDN design.
4.4 The 6 Google Cloud datacenters used in our deployment and experiments.
4.5 Completion time deviation.
4.6 Comparison of different solutions for early deadline requests.
4.7 Comparison of different solutions as the number of destinations increases.
4.8 Throughput comparison of different solutions.
4.9 The computation time of our approach.
Chapter 1
Introduction
In the era of big data analytics, as the volume of data grows exponentially, the need
for data processing becomes more pressing than ever before. Cloud computing enables
cheap and easy access to shared pools of computational resources, which provides users
and enterprises with capabilities to store and process their data efficiently and reliably.
Different kinds of applications can be hosted in datacenters by renting virtual machines. These applications are characterized by diverse resource requirements across multiple dimensions, such as memory, CPU cores, storage space, and network bandwidth. Due to this variety of resource requirements, resource management involves some non-trivial challenges. One of these challenges is how to deploy the virtual machines (VMs) requested by applications in datacenters. Improper VM packing schemes can cause resource wastage and VM performance degradation. Applications running in a datacenter rely heavily on the performance of their VMs, so it is important to guarantee that each VM request gets its fair share of resources.
Another challenge is how to take full advantage of available inter-datacenter resources to meet more customer requirements. To increase availability and reduce latency for end users, large companies and cloud providers are deploying tens to hundreds of geographically distributed datacenters around the world. Many applications, such as geo-replication, need to deliver multiple copies of data from a single datacenter to multiple datacenters, which improves fault tolerance, increases availability, and achieves high service quality. These applications usually require completing multicast transfers before certain deadlines. Due to the limited bandwidth between datacenters, how to allocate and schedule bandwidth for inter-datacenter transfers so as to meet customer requirements becomes a crucial issue.
Therefore, in this thesis, we present our solutions to these two challenges in resource optimization. In the following sections, for each of these two problems, we give a brief overview of the background and the limitations of existing solutions, and present our contributions. Finally, we describe the organization of this thesis.
1.1 Virtual Machine Placement
Virtualization [1] is a key technology of cloud computing: it partitions physical hardware resources to create virtual machines (VMs) with dedicated resources. A VM acts like a real computer with its own operating system. Through virtualization, multiple VMs can run on the same physical machine (PM) at the same time, increasing the efficiency and utilization of hardware resources.
VM placement is the process of selecting the most appropriate PM for deploying
VMs. Since different users have diverse resource requirements for VMs and resources on
each PM are limited, improper VM placement can cause unbalanced resource utilization
(overloaded in some resources but underutilized in others). Such resource fragmentation
requires extra PMs and wastes more resources. Therefore, it is crucial to balance PM resources along multiple dimensions (e.g., CPU, memory, storage, and network bandwidth) during VM placement, and to minimize the number of activated PMs.
Existing works [2–4] indicated that VMs tend to utilize fewer resources than their reserved capacities, which causes substantial resource wastage. Resource overcommitment is widely used to mitigate this wastage by allocating more resources to VMs than the PM physically has. For example, if a PM has 64 GB of memory and is sold as 128 GB, we say that the overcommit ratio of this PM is 2. Commercial cloud management products such as VMware ESX Server incorporate resource overcommitment extensively [5]. It can be problematic to consider only minimizing the number of activated PMs and reducing resource fragmentation in an over-committed cloud. While resource overcommitment increases resource utilization and benefits cloud service providers, it also increases the risk of provider-induced overload. Overload happens when users collectively demand enough resources to exhaust all available physical resources, which degrades VM performance and could drive off users. As such, it is of interest to consider resource overcommitment during VM placement, so that the risk of service degradation is reduced while resource utilization is enhanced.
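As a concrete illustration, the following sketch works through the overcommitment arithmetic; the 64 GB PM sold as 128 GB comes from the example above, while the VM reservations and the 60% utilization level are hypothetical.

```python
# Overcommitment arithmetic for a single PM. The 64 GB / 128 GB figures
# are taken from the example in the text; the VM reservations and the
# utilization fraction below are hypothetical.
physical_gb = 64.0           # memory actually installed in the PM
advertised_gb = 128.0        # memory "sold" to VMs
overcommit_ratio = advertised_gb / physical_gb   # 128 / 64 = 2.0

vm_reservations_gb = [32, 24, 16, 40]  # reserved sizes; sum = 112 <= 128, so all fit nominally
utilization = 0.6                      # fraction of its reservation each VM actually uses

actual_usage_gb = utilization * sum(vm_reservations_gb)  # 0.6 * 112 = 67.2 GB
overloaded = actual_usage_gb > physical_gb               # 67.2 > 64, so the PM is overloaded

print(overcommit_ratio)  # 2.0
print(overloaded)        # True
```

Even though the reservations fit within the advertised 128 GB, moderate actual usage already exceeds the 64 GB of physical memory — exactly the provider-induced overload described above.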
There has been a significant amount of work on VM placement. Some existing works [6–9] considered only one resource dimension when placing VMs; these approaches overlooked the multi-dimensional nature of the problem and did not balance resources along different dimensions. Other works solved the problem in an offline optimization manner [10–14], assuming that resource requirements are known in advance. First-Fit Decreasing is a well-known heuristic of this kind, which sorts VMs by size in decreasing order before placing them. These approaches did not adequately reflect the true nature of VM requests, with their unpredictable arrivals and departures. Other approaches like [15, 16] studied the VM placement problem with the goal of achieving balanced resource utilization along multiple dimensions while minimizing the number of activated physical machines, objectives very close to ours. However, they did not consider the PM overloading problem in an over-committed cloud. The overcommitment issue is typically considered in VM migration [2, 3], and very few works have taken it into account during VM placement.
To address the challenge of deploying VMs, we solve the VM placement problem with the objective of balancing the use of resources along multiple dimensions and reducing the risk of PM overloading, while taking overcommitment into account. Our contributions are the following:
First, we consider multiple dimensions of resources in VM placement, and generate both homogeneous and heterogeneous settings of PM resource capacity configurations in our simulations, which realistically resemble modern cloud datacenters.
Second, we consider a model in which VM deployment requests arrive and depart dynamically, in an online manner where resource requirements are not known beforehand. This dynamic nature produces a realistic scenario.
Third, we propose a threshold-based algorithm called Min-DIFF, which considers resource overcommitment in an effort to reduce the risk of service degradation. In addition, Min-DIFF obtains a more balanced use of resources along different dimensions than related works, which reduces resource fragmentation effectively.
Finally, our simulations are driven by both real-world traces and datasets we generated, to evaluate the effectiveness of Min-DIFF under a wide spectrum of conditions. Our simulation results show that Min-DIFF performs better in three aspects. First, Min-DIFF uses fewer PMs and achieves lower resource fragmentation than other approaches when the overcommitment issue is not taken into account. Second, Min-DIFF has a lower risk of PM overloading than other approaches in an over-committed cloud. Third, Min-DIFF is better able to accommodate VM requests than related works.
1.2 Inter-Datacenter Multicast Transfers with Deadlines
To improve fault tolerance, increase availability, and achieve high service quality, many applications require efficient data transfers from one datacenter to multiple datacenters, typically for data replication, database synchronization, and data backup. For example, search engines need to synchronize databases regularly to achieve a higher quality of user experience [17]. Blocks of a file in many distributed file systems, such as HDFS, are replicated for fault tolerance.
Inter-datacenter transfers can roughly be classified into three categories based on their delay tolerance: interactive transfers, elastic transfers, and background transfers [18]. Interactive transfers, like video streams and web requests, are highly sensitive to loss and delay, so they should be delivered instantly with strictly higher priority. Elastic transfers are delay-tolerant but still require timely delivery (before a deadline); for example, many applications need to back up data at regular intervals. Background transfers, such as data warehousing, do not have explicit deadlines.
Why do we need to consider transfer deadlines? When multiple inter-datacenter transfers share the same links in the inter-datacenter network, the total demand of these transfers typically far exceeds the available network capacity. On the one hand, some transfers, such as elastic transfers, need to be completed in a timely manner, a requirement that can be modeled as deadlines. On the other hand, cloud providers set deadlines for most transfers based on their delay tolerance and on customer service level agreements (SLAs). A survey of WAN customers at Microsoft [19] shows that most transfers require deadlines, that missing a deadline incurs a penalty, and that customers are willing to pay more for guaranteed deadlines. Therefore, meeting as many transfer deadlines as possible is an important topic.
For resource optimization in the inter-datacenter network, we focus on elastic and background transfers that deliver data from one datacenter to multiple datacenters. We propose an efficient solution that maximizes network throughput while considering transfer deadlines at the same time. The multicast (one-to-many) transfer type is quite representative, since other transmission types like unicast (one-to-one) and broadcast (one-to-all) can be transformed into it.
Traditional wisdom used Steiner Tree Packing [20, 21] to maximize the flow rate from a source to multiple destinations, which is an NP-complete problem. Another approach is to treat the multicast transmission as multiple unicast transfers. Existing solutions like B4 [22], SWAN [18], and BwE [23] aimed to maximize utilization and focused on max-min fairness. Tempus [24] designed a strategy to maximize the minimum fraction of transfers finished before their deadlines. Amoeba [25] guaranteed deadlines by introducing a deadline-based network abstraction for inter-datacenter transfers. DCRoute [17] scheduled each transfer on a single path to avoid packet reordering, and it also guaranteed transfer deadlines for admitted requests.
Unfortunately, these solutions were not explicitly designed for multicast transfers, and can actually waste bandwidth by finding paths from the source to each destination, resulting in more transfer requests with deadline requirements being rejected. DCCast [26] and DDCCast [27] proposed to use minimum weight Steiner trees, and DDCCast used an As Late As Possible (ALAP) policy for rate allocation. However, DCCast and DDCCast used a single minimum weight Steiner tree for each request, which reduces the flexibility of choosing routing paths. Moreover, if the bandwidth required by a request with a specific deadline is higher than the maximum available bandwidth in the network, the request will be rejected when only one tree can be chosen; if we can instead split the traffic at the source and use multiple trees for delivering data, the request can meet its deadline with higher throughput. In addition, in the admission control part of DDCCast, a request can be rejected although it could have been admitted by choosing other forwarding trees. Neither DCCast nor DDCCast aimed to maximize throughput.
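The benefit of splitting traffic across multiple trees can be seen with a back-of-the-envelope calculation. All numbers below are hypothetical, and each tree is summarized by its bottleneck rate:

```python
# Why splitting a multicast transfer across several Steiner trees can
# save a deadline: a toy calculation with hypothetical numbers.
volume_gb = 100.0    # data to deliver to all destinations
deadline_s = 400.0   # seconds until the deadline
required_rate = volume_gb / deadline_s   # 0.25 GB/s needed overall

# Bottleneck rate of each candidate tree (GB/s).
tree_capacities = [0.15, 0.12]

# With a single tree, the request must fit on the best tree alone:
feasible_single = max(tree_capacities) >= required_rate   # 0.15 < 0.25
print(feasible_single)  # False

# Splitting traffic at the source across both trees:
feasible_split = sum(tree_capacities) >= required_rate    # 0.27 >= 0.25
print(feasible_split)   # True
```

No single tree can sustain the 0.25 GB/s the deadline implies, but the two trees together can, so the request need not be rejected.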
To tackle this problem, we design a new routing and scheduling algorithm for multiple multicast data transfers across geo-distributed datacenters, maximizing network throughput while taking transfer deadlines into consideration. We have implemented our solution in an application-layer software-defined inter-datacenter network, and evaluated its performance with real-world experiments on the Google Cloud Platform. Our contributions are the following:
First, prior works on inter-datacenter traffic engineering [17, 18, 22–25] focused on unicast transfers, which are not effective for multicast transfers. We propose to use Steiner trees for each multicast transfer.
Second, prior work on multicast inter-datacenter transfers [26, 27] used one tree for each transfer, which could reject some transfer requests with early deadlines. Our solution offers higher routing flexibility and uses at least one tree for each transfer. We formulate the problem as a Linear Program (LP), which can pack multiple multicast transfers with deadlines efficiently and achieve high throughput. In addition, to reduce packet reordering overhead at the destination, we add a penalty function to the objective of the LP and use a log-based heuristic [28, 29] to find sparse solutions.
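As a rough illustration of the constraint structure of such an LP (not the actual formulation, which appears in Chapter 4), the following sketch checks a candidate rate allocation on a toy instance; the topology, candidate trees, and all rates are hypothetical.

```python
# A toy feasibility check mirroring the LP's two constraint families.
# For each request r and candidate Steiner tree t, x[(r, t)] is the
# sending rate on that tree. A solution is feasible if (i) no link
# capacity is exceeded and (ii) each request's total rate meets the
# minimum rate implied by its deadline.
link_capacity = {"AB": 1.0, "AC": 1.0, "BC": 0.5}

# Candidate trees per request, each given as the set of links it uses.
trees = {"R1": {"T1": {"AB", "AC"}, "T2": {"AB", "BC"}}}

# Minimum rate per request: data volume divided by time to deadline.
min_rate = {"R1": 0.8}

def is_feasible(x):
    # (i) Link capacity constraints.
    for link, cap in link_capacity.items():
        load = sum(rate for (req, tree), rate in x.items()
                   if link in trees[req][tree])
        if load > cap + 1e-9:
            return False
    # (ii) Deadline (minimum-rate) constraints.
    for req, rate in min_rate.items():
        total = sum(r for (q, _), r in x.items() if q == req)
        if total + 1e-9 < rate:
            return False
    return True

# One feasible allocation that splits R1's traffic across both trees.
x = {("R1", "T1"): 0.6, ("R1", "T2"): 0.2}
print(is_feasible(x))  # True
```

The actual LP maximizes total throughput (the sum of all rates) over these constraints; the penalty term mentioned above additionally discourages spreading a request over many trees.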
Third, prior work on multicast techniques [30–34] used software-defined networking (SDN) at the network layer. However, hardware switches in each datacenter can only support a limited number of forwarding entries, and it is complicated and costly to solve the flow table scalability problem at large scales. We have implemented our solution as an application-layer software-defined network, which does not need to modify the underlying network and can scale up to a large number of transfer requests.
Fourth, our real-world experimental results on the Google Cloud Platform show that our solution achieves higher throughput and accommodates more transfer requests with deadlines, as compared with existing related works that consider deadlines.
1.3 Thesis Organization
The remainder of this thesis is organized as follows. In Chapter 2, we discuss the related works regarding these two problems. To balance the usage of resources across multiple dimensions and reduce the risk of PM overloading, in Chapter 3 we propose a threshold-based online VM placement algorithm. In Chapter 4, we consider the resource optimization challenge across different datacenters: we propose to use multiple Steiner trees for multicast transfers, with the purpose of maximizing network throughput and meeting as many transfer deadlines as possible. Finally, we summarize our work in Chapter 5.
Chapter 2
Related Work
Problems of resource optimization across datacenters have been extensively studied in the field of cloud computing. In this chapter, we first present related works about deploying VM requests in datacenters, and then discuss existing works related to resource optimization for inter-datacenter transfers.
2.1 Virtual Machine Placement
VM placement can be formulated as a bin packing problem, which is proven to be NP-hard [35]. Many heuristics have been proposed to solve this problem. A widely used approach is the First Fit heuristic, which allocates each VM request to the first available PM. The limitation of First Fit is that resources on PMs may become imbalanced. Some approaches mainly used one resource type as the allocation criterion; Min-Min and Max-Min are two well-known heuristics that assign a VM request based on CPU capacity.
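The First Fit heuristic just described can be sketched as follows, using hypothetical capacities and a single resource dimension — which is exactly the simplification the surrounding discussion criticizes:

```python
# First Fit for VM placement along one resource dimension (hypothetical
# sizes). Each VM goes to the first PM with enough remaining capacity;
# a new PM is activated when none fits.
def first_fit(vm_sizes, pm_capacity):
    """Return a list of PMs, each a list of the VM sizes placed on it."""
    free = []       # remaining capacity of each activated PM
    placement = []  # VM sizes placed on each PM
    for size in vm_sizes:
        for i, remaining in enumerate(free):
            if size <= remaining:
                free[i] -= size
                placement[i].append(size)
                break
        else:  # no existing PM fits: activate a new one
            free.append(pm_capacity - size)
            placement.append([size])
    return placement

print(first_fit([8, 4, 2, 6, 3], 10))  # [[8, 2], [4, 6], [3]]
```

Note how PM utilization ends up uneven along a single dimension already; with multiple resource dimensions the imbalance noted in the text only gets worse.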
Monil et al. [6] proposed a multi-pass Best Fit Decreasing VM placement algorithm that achieved a balance between energy consumption and quality of service. It did not consider multiple dimensions of resources, focusing instead on CPU utilization only.
Wang et al. [7] formulated the VM placement problem as a Stochastic Bin Packing problem, using random variables to characterize the uncertain future bandwidth usage of each VM. However, their probabilistic characterization considered only bandwidth. Zhang et al. [8] formulated the problem as a constrained minimum k-cut problem under the constraint of VM performance. Nevertheless, their solution used energy consumption as a single criterion, rather than specific PM resources. These approaches did not consider balancing the use of multiple resources, which can leave one resource highly utilized while other resources are under-utilized.
Other approaches solved the VM placement problem in an offline manner, assuming that the VM requests are known beforehand. Beloglazov et al. [14] modified the Best Fit Decreasing algorithm for deploying VMs, sorting all VM requests in decreasing order of their CPU requirements. Many related works like [9, 36–40] formulated VM placement as an optimization problem, which is not suitable for dynamic VM requests. In detail, Xu et al. [36] formulated the static VM placement scenario as a multi-objective optimization problem and built on a two-level control genetic algorithm to achieve high scalability and robustness. Adamuthe et al. [37] also solved a multi-objective optimization problem, with the objectives of maximizing profit, maximizing load balance, and minimizing resource wastage. Yanagisawa et al. [9] presented a mixed integer programming approach for the optimal placement of VMs with respect to minimizing PM resources while guaranteeing fault tolerance; it mainly focused on CPU resources. OVMP [39] used the optimal solution of a stochastic integer program (SIP) to minimize the cost of hosting virtual machines across multiple cloud providers while considering future demand and price uncertainty. Rampersaud et al. [10] designed an approximation algorithm that took multi-dimensional resources into account to maximize the profit derived from hosting VMs. However, in real scenarios, requests arrive at different time slots, and it is hard to know VM requests in advance.
There are many related works that proposed approaches for online VM placement [15, 16, 41–44] with consideration of multi-dimensional resources, but few considered the overcommitment issue in VM placement. Alicherry et al. [44] optimized data access latencies by using an intelligent VM placement algorithm. Dong [42] combined minimum cut with best-fit to design a novel greedy algorithm that reduces the number of activated physical servers and network elements to achieve energy savings. Mishara et al. [43] proposed a methodology for dynamic VM placement based on vector arithmetic. Max-BRU [15] considered multiple resource types, focusing on maximizing resource utilization while balancing the use of multiple types of resources. EAGLE [16] proposed a multi-dimensional space partition model to balance resource utilization along different dimensions while minimizing the total energy consumed by running PMs. Overcommitment is mainly considered in VM migration [2, 3]. In this thesis, we use a threshold-based idea to efficiently reduce the risk of PM overloading caused by overcommitment, which also reduces migration overhead.
2.2 Inter-Datacenter Multicast Transfers with Deadlines
There is a large body of related work on datacenter traffic engineering and deadline-aware routing. Video streaming is one type of multicast transfer, which needs to deliver video content from a single source to users in remote regions; this kind of transfer is highly delay-sensitive. Celerity [45] packed only depth-1 and depth-2 trees, Airlift [46] maximized throughput without violating end-to-end delay constraints by using network coding, and Liu et al. [47] proposed a delay-optimized routing scheme that solves only linear programs. However, these works were explicitly designed for delay-sensitive video streaming. We focus on elastic and background transfers, which are delay-tolerant, some of which have deadlines.
Some existing works focused on improving the performance of bulk transfers. Laoutaris et al. [48] proposed NetStitcher, which minimized the completion time of bulk transfers by stitching together unutilized bandwidth and employing a store-and-forward algorithm. They extended this work in [49] by considering time-zone differences for delay-tolerant bulk transfers. Chen et al. [50] considered bulk transfers with deadlines in grid networks. Store-and-forward is also used in [51, 52] to complete transfers: Wang et al. [51] aimed to minimize the network congestion of deadline-constrained bulk transfers, and Wu et al. [52] concentrated on a per-chunk routing scheme. However, storing data at intermediate datacenters increases storage cost and transfer overhead. Owan [53] jointly optimized bulk transfers in the optical and network layers. These works considered only unicast bulk transfers.
Google B4 [22] and Microsoft SWAN [18] used SDN for inter-datacenter traffic engineering to maximize network throughput. BwE [23] provided work-conserving bandwidth allocation and focused on max-min fairness. These works did not consider transfer deadlines. Tempus [24] proposed an online scheduling scheme to maximize the minimum fraction of inter-datacenter transfers finished before their deadlines. Amoeba [25] and DCRoute [17] guaranteed deadlines for admitted requests, but they were not explicitly designed for multicast transfers.
DCCast [26] chose minimum weight forwarding trees for transfers and focused on multicast transfers. DDCCast [27] was based on DCCast and took transfer deadlines into consideration. Our work differs from DDCCast in that our solution chooses at least one tree for each transfer, which can accommodate more transfer requests than using exactly one tree and achieves higher throughput.
Chapter 3
An Online Virtual Machine
Placement Algorithm
in an Over-Committed Cloud
Public cloud providers such as Amazon EC2 and Google Cloud Platform provide users
with a massive pool of PMs to create different types of VMs for storing and processing
data. In cloud computing, after users submit their VM requests, VM placement is
conducted to select the most suitable PM to host each VM. The performance varies for
different VM placement schemes. Most of the existing works only consider maximizing
the resource utilization of PMs without taking the overcommitment issue into account,
which can cause PM overloading and degrade VM performance. In this chapter, we
propose an algorithm, called Min-DIFF, that can balance the usage of resources along
multiple dimensions and reduce the risk of PM overloading effectively.
This chapter is organized as follows. In Sec. 3.1, we motivate our work by using
a simple example. We discuss how to deploy VM requests and illustrate details of our
algorithm Min-DIFF in Sec. 3.2. In Sec. 3.3, we explain the architecture of our simulator,
present our simulation setup and show the simulation results by using different datasets.
Finally, we conclude the chapter in Sec. 3.4.
3.1 Motivation Example
Previous works on VM placement try to maximize the utilization of PMs and pack VMs
as tightly as possible. However, resource overcommitment may cause PM overloading
when the total resources utilized by VMs exceed the PM's actual capacities. We use an
example to illustrate PM overloading, considering only memory for simplicity. As shown
in Figure 3.1(a), suppose there are three VM requests, VM1, VM2, and VM3, requiring
32GB, 24GB, and 16GB of memory, respectively. The memory capacity of the PM is
36GB, and it is sold as 72GB with an overcommit ratio of 2. If we pack all three VMs
into this PM and the VMs utilize 60% of their requested resources, the PM will be
overloaded. Overloading can substantially degrade VM performance, and some VMs will
not get their fair share of resources. A better approach is to set an 80% threshold on
the 72GB of memory; the total resources of VMs placed in the PM cannot exceed this
threshold. As we can see from Figure 3.1(b), only VM1 and VM2 are placed in this PM,
and VM3 is placed in another PM. In this way, the resources utilized by VMs will not
exceed the PM's capacities, and all VMs can obtain good performance. Related works
only consider the overcommitment issue in VM migration. Nevertheless, migrating
VMs away from overloaded PMs causes extra overhead and increases bandwidth usage.
We save this overhead and network bandwidth by considering overcommitment in the
initial VM placement. We will discuss the setting of the threshold later.
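The arithmetic behind this example can be sketched in a few lines (a minimal illustration; the function name is mine, and the 60% utilization and capacity figures are taken from the example above):

```python
def is_overloaded(requested_gb, utilization, actual_capacity_gb):
    """A PM is overloaded when the memory its VMs actually use
    exceeds the PM's physical capacity."""
    used = sum(requested_gb) * utilization
    return used > actual_capacity_gb

# Physical capacity 36 GB, sold as 72 GB (overcommit ratio 2).
vms = [32, 24, 16]          # VM1, VM2, VM3 memory requests (GB)

# Packing all three: 72 GB requested, 60% utilized -> 43.2 GB > 36 GB.
print(is_overloaded(vms, 0.6, 36))        # True: the PM is overloaded

# With an 80% threshold on the sold 72 GB (57.6 GB), only VM1 and VM2
# fit (56 GB requested); 60% utilized -> 33.6 GB <= 36 GB.
print(is_overloaded(vms[:2], 0.6, 36))    # False: within capacity
```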
Figure 3.1: A motivation example of VM placement: (a) Packing VMs as tightly as possible. (b) Setting an 80% threshold for the over-committed PM.
3.2 Min-DIFF: An Online VM Placement Algorithm
In this section, we present our proposed threshold-based online VM placement algorithm
Min-DIFF, which can reduce resource fragmentation efficiently with the consideration of
the overcommitment issue.
Figure 3.2 gives a sketch of the threshold-based idea, with grey squares representing
different VMs. We deploy VMs using one of the two strategies shown in Figure 3.2. In
Strategy 1, we select the most appropriate PM to place VMs under the threshold. If we
cannot find space under the threshold, we fall back to Strategy 2 and place VMs without
considering the threshold; Strategy 1 always has the highest priority. Table 3.1 presents
the variables used in this chapter and their definitions. A VM request i is denoted as a
tuple {a_i, du_i, VM_i^d}.
Variable    Meaning
D           The number of resource dimensions.
d           The index of resource dimensions, d = 1, ..., D.
j           The index of PMs.
i           The index of VM requests.
U_j^d       Used resource along dimension d of the jth PM.
PM_j^d      Total resource along dimension d of the jth PM.
VM_i^d      Resource requirement along dimension d of the ith VM request.
a_i         Arrival time of the ith VM request.
du_i        The duration of the ith VM request.
w^d         The warning line of PMs along dimension d.
L^d         The largest VM resource requirement along dimension d.
Th_j^d      The threshold along dimension d of the jth PM.
RF_j        Resource fragmentation of the jth PM.
NR_j^d      The normalized residual resource along dimension d of the jth PM.
NU_j^d      The normalized used resource along dimension d of the jth PM.

Table 3.1: Variables used in this chapter.
Strategy 1: Place VMs below the threshold.
Strategy 2: Place VMs without considering the threshold.

Figure 3.2: A sketch of the threshold-based idea.
3.2.1 Resource Threshold
Typically, to guarantee performance for most VMs and reduce the risk of PM overloading,
some providers do not want the utilization of over-committed PMs to exceed a specific
percentage [54]; we call this percentage the warning line in this chapter. For example,
if a PM is sold as 72GB of memory with an overcommit ratio of 2 and the warning line
for memory is 80%, then the total memory of VMs in this PM should not exceed 57.6GB.
The warning line is taken into account when we set the resource threshold in Min-DIFF.
On the other hand, to reduce resource fragmentation, we reserve enough space above the
threshold for large VMs. Otherwise, if we cannot find enough space below the threshold
and need to use Strategy 2, a large VM cannot be placed in the PM, which causes large
resource fragmentation. Therefore, based on the warning line w^d and the
largest VM requirement L^d, the threshold Th_j^d is defined as:

    Th_j^d = min{ (PM_j^d − L^d) / PM_j^d , w^d }.    (3.1)
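As a sketch, the threshold for the running example (a PM sold as 72 GB of memory, a largest VM request of 32 GB, and an 80% warning line) follows directly from Equation (3.1); the function name is illustrative:

```python
def threshold(pm_capacity, largest_vm, warning_line):
    """Threshold along one dimension, per Equation (3.1): reserve room
    for the largest VM, but never exceed the provider's warning line."""
    return min((pm_capacity - largest_vm) / pm_capacity, warning_line)

# (72 - 32) / 72 ≈ 0.556 is stricter than the 0.8 warning line,
# so the reserved-space term dominates here.
print(threshold(72, 32, 0.8))
```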
3.2.2 Find the Best PM for Single VM Request
If there is a single VM request at a time slot, we find the best PM for this request; this
section illustrates how that PM is chosen.
In order to leave room for future VM requests, we try to balance the residual resources
along multiple dimensions on each PM. Otherwise, if the residual resources along one
dimension are exhausted, the resources along the other dimensions are wasted. Such
resource fragmentation prevents future VM requests from being placed and wastes resources.
As shown in Figure 3.3, we use a simple example to illustrate the concept of resource
fragmentation. The biggest rectangle represents the total CPU and memory capacities
of a PM. Three VMs are deployed in this PM. The three small rectangles denote the
amount of CPU and memory allocated to each VM. Once a VM is placed in the PM, the
available resource capacity is reduced along each dimension. In this example, the PM has
a lot of available CPU but very little unused memory, which prevents further VM requests
from being placed in this PM due to insufficient memory.
Since a datacenter provides a pool of resources such as CPU, memory, network
bandwidth, and storage, we deal with multiple resource types in VM placement. Inspired
by previous works [36, 55, 56], we extend the resource wastage model, which was specific
to two dimensions, to multiple dimensions. The following equation is used to measure
Figure 3.3: An illustrative example of resource fragmentation.
the resource fragmentation of a PM:

    RF_j = [ Σ_{p, p≠m} (NR_j^p − NR_j^m) ] / [ Σ_{d=1}^{D} NU_j^d ],    (3.2)
where RF_j represents the resource fragmentation of the jth PM. NR_j^m indicates the
smallest normalized residual resource, and NR_j^p denotes the normalized residual resource
along dimension p, where p ≠ m. The numerator therefore sums the differences between
the smallest normalized residual resource and the others, and the denominator sums the
normalized used resource along each dimension. When computing the resource fragmentation,
both the residual and the used resource on the PM are normalized by the PM's overall
capacity. Evidently, the more resource is used and the more balanced the residual
resources are across dimensions, the smaller the resource fragmentation value.
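Equation (3.2) can be computed directly; a minimal sketch (the function name is mine, not from the thesis):

```python
def fragmentation(used, capacity):
    """Resource fragmentation of one PM, per Equation (3.2).
    `used` and `capacity` are per-dimension lists."""
    nu = [u / c for u, c in zip(used, capacity)]   # normalized used
    nr = [1 - x for x in nu]                       # normalized residual
    m = min(nr)                                    # smallest residual
    return sum(r - m for r in nr) / sum(nu)

# A PM with plenty of CPU left but almost no memory fragments badly:
print(fragmentation([2, 30], [16, 32]))   # unbalanced residuals, high RF
print(fragmentation([8, 16], [16, 32]))   # perfectly balanced -> 0.0
```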
To reduce resource fragmentation and pack VMs tightly, we propose an efficient
algorithm based on the resource fragmentation Equation (3.2). An intuitive idea is to
choose the PM that has the minimal resource fragmentation value after the VM is placed.
However, this idea is problematic for utilized PMs. For example, suppose there are
two utilized PMs and a VM needs to be placed. If placing the VM in PM1 yields RF_1 = 0.3
and placing it in PM2 yields RF_2 = 0.2, then PM2 is selected. Nevertheless, this idea
ignores the resource fragmentation value before the VM is placed: if RF_1 = 0.6
and RF_2 = 0.1 beforehand, deploying the VM in PM2 actually makes its resource
fragmentation value higher.
A proper approach is: for non-empty PMs, we deploy the VM in the PM that yields the
largest resource fragmentation reduction. Before a VM is placed, the normalized used
resource is:

    NU_{j,bef}^d = U_j^d / PM_j^d,   d = 1, ..., D.    (3.3)

The normalized residual resource is:

    NR_{j,bef}^d = 1 − NU_{j,bef}^d,   d = 1, ..., D.    (3.4)
The smallest value among the NR_{j,bef}^d is NR_{j,bef}^m; the initial resource
fragmentation is then:

    RF_{j,bef} = [ Σ_{p, p≠m} (NR_{j,bef}^p − NR_{j,bef}^m) ] / [ Σ_{d=1}^{D} NU_{j,bef}^d ].    (3.5)
Similarly, after the VM is placed, the normalized used resource is:

    NU_{j,aft}^d = (U_j^d + VM_i^d) / PM_j^d,   d = 1, ..., D.    (3.6)
The normalized residual resource is:

    NR_{j,aft}^d = 1 − NU_{j,aft}^d,   d = 1, ..., D.    (3.7)
We again find the smallest value among the NR_{j,aft}^d, denoted NR_{j,aft}^m. The
resource fragmentation value after deploying the VM is therefore:

    RF_{j,aft} = [ Σ_{p, p≠m} (NR_{j,aft}^p − NR_{j,aft}^m) ] / [ Σ_{d=1}^{D} NU_{j,aft}^d ].    (3.8)
Then we calculate the difference between the resource fragmentation before and after
the VM is placed:

    δRF_j = RF_{j,bef} − RF_{j,aft}.    (3.9)
For non-empty PMs, we choose the PM that has the largest δRF_j. For empty PMs, we
select the PM with the most balanced utilization along resource dimensions: we calculate
the differences between the smallest normalized residual resource NR_j^m and the others
NR_j^p, and choose the PM that has the smallest RF_{j,empty}:

    RF_{j,empty} = Σ_{p, p≠m} (NR_j^p − NR_j^m).    (3.10)
To pack VMs tightly, we first find the most appropriate PM among the non-empty PMs;
if there is no available utilized PM, we choose the best PM among the empty PMs.
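Putting Equations (3.9) and (3.10) together, the PM-selection rule can be sketched as follows (a self-contained sketch; the helper names and the dictionary layout for a PM are my own, not from the thesis):

```python
def _fragmentation(used, capacity):
    # Equation (3.2): imbalance of normalized residuals over total use.
    nu = [u / c for u, c in zip(used, capacity)]
    nr = [1 - x for x in nu]
    m = min(nr)
    return sum(r - m for r in nr) / sum(nu)

def delta_rf(used, capacity, vm):
    """Fragmentation reduction if `vm` is placed on this PM (Eq. 3.9)."""
    after = [u + r for u, r in zip(used, vm)]
    return _fragmentation(used, capacity) - _fragmentation(after, capacity)

def rf_empty(capacity, vm):
    """Balance score for placing `vm` on an empty PM (Eq. 3.10)."""
    nr = [1 - r / c for r, c in zip(vm, capacity)]
    m = min(nr)
    return sum(x - m for x in nr)

def best_pm(pms, vm):
    """Utilized PM with the largest reduction wins; otherwise the empty
    PM on which `vm` leaves the most balanced residuals."""
    utilized = [p for p in pms if any(p["used"])]
    if utilized:
        return max(utilized, key=lambda p: delta_rf(p["used"], p["cap"], vm))
    return min(pms, key=lambda p: rf_empty(p["cap"], vm))

pms = [{"used": [0, 0], "cap": [100, 100]},
       {"used": [50, 10], "cap": [100, 100]}]
print(best_pm(pms, [10, 40]) is pms[1])   # the utilized PM is preferred
```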
3.2.3 VM Selection for Multiple VM Requests
If there are multiple VM requests at a time slot, we do the placement as follows.
When resources are available on a PM, we select the set of VM requests at the current
time slot whose resource requirements can be accommodated by that PM. If the PM is
utilized, we compute δRF_j for each VM request in this set; the request with the largest
δRF_j is placed in the PM. If the PM is empty, we compute RF_{j,empty} and place the
VM with the smallest value. This process repeats until the PM cannot accommodate any
remaining VM request in the current time slot; then we move to the next PM to place
the other VM requests.
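The per-PM selection loop above can be sketched generically; the `score` callback stands in for δRF_j (utilized PM) or −RF_{j,empty} (empty PM), and all names here are illustrative:

```python
def fill_pm(pm_free, requests, score):
    """Greedily fill one PM: repeatedly place the feasible request
    with the best score until none fits."""
    placed = []
    free = list(pm_free)
    while True:
        feasible = [r for r in requests
                    if r not in placed
                    and all(d <= f for d, f in zip(r, free))]
        if not feasible:
            break                      # move on to the next PM
        best = max(feasible, key=score)
        placed.append(best)
        free = [f - d for f, d in zip(free, best)]
    return placed, free

# Tiny example: a 2-dimensional PM with (8, 8) free; score = total demand.
reqs = [(4, 2), (6, 6), (2, 2)]
placed, free = fill_pm((8, 8), reqs, score=sum)
print(placed, free)   # (6, 6) then (2, 2); (4, 2) no longer fits
```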
3.2.4 Details of Min-DIFF
Min-DIFF is illustrated by Algorithm 3.1. First of all, we calculate the threshold for
each PM based on Equation (3.1) (lines 2-4). If there are multiple requests at the time
slot, we use Strategy 1 to place the current VMs below the threshold by calling the
function PlaceCurVMsBlwTh(current_VMs), which is presented as Algorithm 3.2. If
there is not enough space for all current VMs, we use Strategy 2, where the function
PlaceCurVMs(current_VMs) is called. PlaceCurVMs(current_VMs) is similar to
PlaceCurVMsBlwTh(current_VMs); the difference is at line 9, which checks whether the
PM can accommodate the current unplaced VMs without considering the threshold.
If there is only one VM request at the time slot, we choose the best PM for this VM.
The function FindBestPM(v, PMs) implements Strategy 2 and is similar to the Strategy 1
function FindBestPMBlwTh(v, PMs) (Algorithm 3.3). The difference between these two
functions is at line 12: FindBestPM(v, PMs) checks whether there is enough space in a
PM for the VM request, regardless of the threshold.
Algorithm 3.1 Min-DIFF algorithm
 1: function VMplacement(VMs, PMs)
 2:   for m in PMs do:
 3:     calculate threshold based on Equation (3.1)
 4:   end for
 5:   while VMs ≠ ∅ do
 6:     current_VMs = requests at the current time slot
 7:     remove current_VMs from VMs
 8:     if length(current_VMs) > 1 then:
 9:       flag, current_VMs
10:         = PlaceCurVMsBlwTh(current_VMs)
11:       if flag is False then:
12:         PlaceCurVMs(current_VMs)
13:       end if
14:     else if length(current_VMs) = 1 then:
15:       for v in current_VMs do
16:         BestPM = FindBestPMBlwTh(v, PMs)
17:         if BestPM is not None then:
18:           Place VM v on BestPM
19:           Remove VM v from current_VMs
20:           continue
21:         end if
22:         BestPM = FindBestPM(v, PMs)
23:         if BestPM is not None then:
24:           Place VM v on BestPM
25:           Remove VM v from current_VMs
26:         end if
27:       end for
28:     end if
29:   end while
30: end function
Algorithm 3.2 Place multiple current VM requests below the threshold
 1: function PlaceCurVMsBlwTh(current_VMs)
 2:   for m in PMs do:
 3:     while True do:
 4:       score_used = −∞
 5:       score_empty = +∞
 6:       placed = False
 7:       selected_VM = 0
 8:       for v in current_VMs do:
 9:         if v can be placed below the threshold then:
10:           placed = True
11:           if m is utilized then:
12:             calculate δRF_j based on Equation (3.9)
13:             if δRF_j > score_used then:
14:               score_used = δRF_j
15:               selected_VM = v
16:             end if
17:           else:
18:             calculate RF_{j,empty} based on Equation (3.10)
19:             if RF_{j,empty} < score_empty then:
20:               score_empty = RF_{j,empty}
21:               selected_VM = v
22:             end if
23:           end if
24:         end if
25:       end for
26:       if placed then:
27:         Place VM selected_VM in PM m
28:         remove VM selected_VM from current_VMs
29:       else:
30:         break
31:       end if
32:     end while
33:   end for
34:   if length(current_VMs) = 0 then:
35:     return True, current_VMs
36:   else:
37:     return False, current_VMs
38:   end if
39: end function
Algorithm 3.3 Find the best PM below the threshold
 1: function FindBestPMBlwTh(v, PMs)
 2:   score_used = −∞
 3:   score_empty = +∞
 4:   placed_used = False
 5:   placed_empty = False
 6:   used_PM = 0
 7:   empty_PM = 0
 8:   for m in PMs do:
 9:     if placed_used is True and m is empty then:
10:       continue
11:     else:
12:       if v can be placed below the threshold then:
13:         if m is utilized then:
14:           calculate δRF_j based on Equation (3.9)
15:           if δRF_j > score_used then:
16:             score_used = δRF_j
17:             used_PM = m
18:           end if
19:           placed_used = True
20:         else:
21:           calculate RF_{j,empty} based on Equation (3.10)
22:           if RF_{j,empty} < score_empty then:
23:             score_empty = RF_{j,empty}
24:             empty_PM = m
25:           end if
26:           placed_empty = True
27:         end if
28:       end if
29:     end if
30:   end for
31:   if placed_used = True then:
32:     return used_PM
33:   else if placed_empty = True then:
34:     return empty_PM
35:   else:
36:     return None
37:   end if
38: end function
3.3 Performance Evaluation
Now we are ready to evaluate the performance of Min-DIFF through simulations. In this
section, we present the architecture of our simulator, simulation setup and evaluation
results.
3.3.1 Architecture of the Simulator
Figure 3.4 shows the architecture of our simulator. The simulator first loads VM requests
from workload traces; backlogged requests are the VM requests in the current time slot.
The VMs are placed in PMs one by one based on the online placement algorithm. If there
is not enough space for a VM, the request is discarded, and the simulator treats it as a
failure. Different VMs have different durations, and the simulator updates the status of
the PMs once a VM is deleted.
Figure 3.4: Architecture of the simulator.
3.3.2 Simulation Setup
We evaluate the performance of Min-DIFF using four types of datasets. For the
first dataset, we set the resource requirements of VMs to match the standard
general-purpose instance types provided by Amazon EC2. Table 3.2 presents
the seven types of T2 instances we use in our simulations. We set D = 3 and use our
3-dimensional VM placement scheme for this dataset.
For the second and third datasets, we generate VM requests that follow the
uniform distribution and the normal distribution, following Hieu et al. [15]. Table 3.3
shows the resource requirements along each dimension of the VM requests, where U(a, b)
denotes the uniform distribution and N(µ, σ) denotes the normal distribution. We set
D = 4 and use our 4-dimensional VM placement scheme for the second and third datasets.
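A sketch of how such synthetic requests might be generated (the function name and seed are my own; the distribution parameters are those of Table 3.3):

```python
import random

def gen_requests(n, dist, dims=4, seed=42):
    """Generate n synthetic VM requests for the second/third datasets:
    each of `dims` dimensions drawn i.i.d. from U(20, 80) or N(50, 12)."""
    rng = random.Random(seed)
    draw = {"uniform": lambda: rng.uniform(20, 80),
            "normal":  lambda: rng.gauss(50, 12)}[dist]
    return [[draw() for _ in range(dims)] for _ in range(n)]

reqs = gen_requests(5, "uniform")
print(len(reqs), len(reqs[0]))   # 5 requests, 4 dimensions each
```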
The last dataset is the real-world workload trace GWA-T-12 Bitbrains [57], which contains
performance metrics of VMs from a distributed datacenter operated by Bitbrains, a service
provider that hosts applications used in financial fields. We extract the CPU cores and
memory requested by VMs from the traces and set the number of dimensions D = 2 for
VM placement.
For all datasets except the real-world workload trace, we consider two scenarios: a
single request at each time slot and multiple requests at each time slot.
Typically, cloud environments are not homogeneous; they are constructed from
different types of machines [58]. To better resemble a real-world cloud, we generate
heterogeneous PMs based on the machine configurations shown in Reiss et al. [58] for
the first dataset and the real-world workload trace. For the second and third datasets,
we generate five types of PMs, with resource capacities of 200, 250, 300, 350, and 400
along all dimensions, respectively.
VM Instances  CPU cores  Memory (GB)  Bandwidth (Mbit/s)
t2.nano       1          0.5          30
t2.micro      1          1            70
t2.small      1          2            200
t2.medium     2          4            300
t2.large      2          8            500
t2.xlarge     4          16           800
t2.2xlarge    8          32           1024

Table 3.2: Amazon EC2 VM instances used in the first dataset.
Distribution  CPU capacity (GHz)  Memory (GB)  Bandwidth (Gbps)  Storage (GB)
U(a, b)       U(20, 80)           U(20, 80)    U(20, 80)         U(20, 80)
N(µ, σ)       N(50, 12)           N(50, 12)    N(50, 12)         N(50, 12)

Table 3.3: Resource requirements used in the second and third datasets.
We evaluate our algorithm Min-DIFF from two aspects:
1. To show the effectiveness of Min-DIFF in reducing the number of PMs activated
and reducing resource fragmentation, we compare Min-DIFF with related works when
the threshold is set to 100%.
2. To show the effectiveness of the threshold-based idea of Min-DIFF in reducing the
risk of overloading, we compare our algorithm with the related works when the
threshold is smaller than 100%.
We compare Min-DIFF with the following schemes for VM placement:
• First Fit algorithm: A VM is placed in the first PM which has available resources.
• The balanced algorithm EAGLE in [16]: It first uses a multi-dimensional space
partition model to divide a PM into three domains: the acceptance domain (AD), the
safety domain (SD), and the forbidden domain (FD). The PM whose posterior utilization
(utilization after a VM is placed) lies in the AD has the highest priority to be
selected, and the PM whose posterior utilization lies in the SD has the second
priority. If a PM's posterior utilization lies in the FD, a new PM is opened to
place the VM.
• Max-BRU algorithm in [15]: This algorithm uses two multi-dimensional metrics:
the resource utilization along the dth dimension and the resource balance ratio, as
the allocation criteria when it finds the best PM for VM placement.
In our simulations, to better compare the performance of different algorithms, we
make the durations of all VM requests infinitely long, which means that once they are
placed, they will not be deleted.
3.3.3 Simulation Results: Threshold 100%
First, we compare Min-DIFF with First Fit, EAGLE and Max-BRU by setting the threshold
to 100%, which means that we aim at packing VMs as tightly as possible when the
resources of PMs are not over-committed. We consider the following performance metrics:
• The number of utilized PMs: K.
• The average resource fragmentation of all utilized PMs:

    RF_avg = (1/K) Σ_{j=1}^{K} RF_j.    (3.11)
Simulation Results
Figures 3.5 and 3.7 show the number of used PMs in the single-request and multi-request
scenarios, respectively. As we can see from the figures, Min-DIFF uses fewer PMs than
EAGLE, First Fit and Max-BRU, and the gain becomes larger as the number of VMs
increases. Figures 3.6 and 3.8 give comparison results for the average resource
fragmentation in the single-request and multi-request scenarios, respectively. Min-DIFF
achieves the lowest resource fragmentation, which means that Min-DIFF has less resource
wastage and obtains a more balanced resource utilization along different dimensions.
EAGLE and Max-BRU do not perform well in the heterogeneous setting, since they simply
open a new PM when they cannot find available resources among the utilized PMs, whereas
Min-DIFF uses Equation (3.10), which works well for deploying VMs on empty PMs.
Figure 3.5: Number of used PMs for different datasets in the single request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
We also give comparison results for the homogeneous setting in Figure 3.9, where the
resource capacity along each dimension of a PM is set to 150 and the results come from
the uniform distribution dataset with multiple requests in each time slot. It is clear
that Min-DIFF uses resources more efficiently than the other algorithms.
Figure 3.6: The average resource fragmentation of all used PMs for different datasets in the single request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
Figure 3.7: Number of used PMs for different datasets in the multi-request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
Results of the Real-World Workload Trace
We also run experiments using the real-world trace GWA-T-12 Bitbrains. Figure 3.10
shows the comparison results. In this workload trace, some time slots contain multiple
VM requests while others contain only one. In this case too, Min-DIFF obtains the best
performance.
Figure 3.8: The average resource fragmentation of all used PMs for different datasets in the multi-request scenario: (a) Amazon EC2. (b) Uniform distribution. (c) Normal distribution.
Figure 3.9: Results of the homogeneous setting: (a) Number of used PMs. (b) The average resource fragmentation of all used PMs.
3.3.4 Simulation Results: Threshold is Smaller than 100%
In this section, we compare Min-DIFF with the other approaches when the threshold
is smaller than 100%, meaning that providers do not want excessively high resource
utilization because of the overcommitment issue; the threshold-based idea is then used
to reduce the risk of PM overloading. We set D = 2 and use our two-dimensional
placement algorithm for deploying VMs. The dataset used in this section is Amazon EC2.
In this section, when we talk about utilization, over-committed resources are included.
Figure 3.10: Results of the real-world workload trace: (a) Number of PMs. (b) The average resource fragmentation of all used PMs.
Assuming the warning line is 80% along each resource dimension, we run simulations
in two scenarios:
1. Light load scenario: There are enough PMs for all VM requests.
2. Heavy load scenario: There are not enough PMs for all VMs; if a VM cannot be
placed, a failure happens.
Figure 3.11 shows comparison results for the number of activated PMs and the average
resource fragmentation in the light load scenario. In this scenario, there are enough PMs
for all VM requests, so all VMs are placed under the threshold. Since we place
VMs below the threshold using Strategy 1, Min-DIFF uses more PMs than the other
baselines. Although Min-DIFF uses more PMs, Figure 3.11(b) shows that it achieves the
most balanced use of resources along different dimensions. Besides, the other baselines
do not consider PM overloading, which leaves more of their PMs at risk of overloading
than Min-DIFF's. As shown in Figure 3.12, Min-DIFF has no PM whose utilization of
over-committed resources exceeds 80%, which substantially reduces the risk of PM
overloading. For the other baselines, the number of PMs that reach at least 80% resource
utilization increases with the number of requests. PM overloading can substantially
degrade VM performance; Min-DIFF reduces this risk efficiently.
Figure 3.11: Comparison results of the light load scenario: (a) Number of PMs. (b) The average resource fragmentation of all used PMs.
Figure 3.12: Light load scenario: the percentage of PMs whose resource utilization is higher than 80% (over-committed resources are included): (a) CPU. (b) Memory.
For the heavy load scenario, we increase the number of VM requests and set the
number of PMs to 2500. Figure 3.13 presents the comparison results. With Min-DIFF,
once the number of VM requests exceeds 16000, there are not enough resources below
the threshold for later requests and all PMs are activated; the scheduler then uses
Strategy 2 for deploying VMs, and the number of used PMs no longer changes. As shown
in Figure 3.13(b), Min-DIFF also achieves lower resource fragmentation than the others.
In the heavy load scenario, all PMs run at very high utilization, so we do not discuss
the overloading problem here.
Since the physical resources are insufficient, some VMs cannot be placed and failures
occur. To show the ability to accommodate requests, we compare the number of failures
in Figure 3.14. When the number of VMs is smaller than 15000 there are enough
resources, so we only plot results beyond 15000. In this figure, Min-DIFF has more
failures than First Fit and Max-BRU when the number of VMs is between 17900 and
19000. This is because Min-DIFF first spreads VMs evenly below the threshold: when
all PMs are utilized under Min-DIFF, there are still empty PMs for Max-BRU and First
Fit. When the number of VMs is larger than 19000, Min-DIFF has fewer failures than
the other approaches. At about 23800 VMs, Min-DIFF has around 2400 failures, while
the other approaches have more than 4000. Therefore, Min-DIFF can accommodate more
VM requests than the other approaches.
3.4 Summary
In this chapter, we propose an online algorithm called Min-DIFF which aims at making
a tradeoff between minimizing the number of activated PMs and reducing the risk of
PM overloading in an over-committed cloud. Besides, Min-DIFF also achieves a more
balanced use of resources along multiple resource dimensions, which significantly reduces
resource fragmentation. To better resemble the real-world scenario, we consider VM
Figure 3.13: Comparison results of the heavy load scenario: (a) Number of PMs. (b) The average resource fragmentation of all used PMs.
requests that arrive at different time slots; both homogeneous and heterogeneous settings
of PMs are considered in our simulations. A real-world workload trace and datasets
generated from various distributions are used in our simulations. Simulation results
demonstrate that our proposed algorithm Min-DIFF achieves better performance and
accommodates more requests than the other schemes in related works.
Figure 3.14: Comparison results of the number of failures.
Chapter 4
Deadline-Aware Scheduling and
Routing for Inter-Datacenter
Multicast Transfers
In the inter-datacenter network, bandwidth between different datacenters is expensive
and scarce. How to take full advantage of available inter-datacenter resources to meet
different customer requirements becomes a great challenge. There are many applications
that need to transfer data from one datacenter to multiple datacenters, and typically
these multicast transfers are required to be completed before deadlines. Some existing
works only consider unicast transfers, which is not appropriate for the multicast
transmission type. Another approach in current works is to find a minimum-weight
Steiner tree for each transfer. Instead of using only one tree per transfer, we propose
to use one or multiple trees, which increases routing flexibility, reduces bandwidth
wastage, and enlarges the maximum capacity available to each transfer. In this chapter,
we focus on the multicast transmission type and propose an efficient solution that aims
at maximizing throughput for all transfer requests while taking deadlines into
consideration. We also show that our solution can reduce packet reordering by selecting
only a few Steiner trees for each transfer. We have implemented our solution as a
software-defined overlay network at the application layer, and real-world experiments
on the Google Cloud Platform show that our system effectively improves network
throughput and achieves a lower traffic rejection rate than existing related works.
The remainder of this chapter is organized as follows. In Sec. 4.1, we motivate our
design with an example and discuss our design objectives and choices. In Sec. 4.2, we
present our solution and formulate the routing problem of multiple multicast
inter-datacenter transfers with deadline considerations. In Sec. 4.3, to show its
practicality, we present a real-world implementation of our design. In Sec. 4.4, we
evaluate its validity and performance on the Google Cloud Platform. We discuss future
work and conclude the chapter in Sec. 4.5.
4.1 Motivation Example
Definition of meeting deadlines: In this chapter, we focus on multicast inter-datacenter transfers, which send copies of the same data from a single source to multiple destinations. For each multicast transfer, we say that the transfer meets its deadline when all destinations receive the complete data before a particular time.
Scheduling strategy: Our solution aims to optimally pack transfer requests that arrive within a small time interval, taking full advantage of available inter-datacenter capacities. If the available bandwidth cannot accommodate all requests, we reject the requests with lower priority and repack them when capacity becomes available. We do not use an As-Late-As-Possible policy, because it may reduce the resources available
Requests   Source   Destinations   Volume (MB)   Deadline (seconds)
R1         1        3, 4           200           40
R2         4        2, 3           200           40

Table 4.1: Request requirements for the motivation example.
for future requests and lead to low throughput. Instead, we take full advantage of the available bandwidth to pack requests in the current scheduling time slot.
Motivation Example: Consider the directed network shown in Figure 4.1, where all link capacities are 10 MB/s. There are two transfer requests, R1 and R2; Table 4.1 shows their detailed requirements. If we treat each multicast transfer as multiple unicast transfers, then we can find paths from the source to each of its destinations independently and assign a rate to each path. Figure 4.1(a) illustrates this approach. However, link 1→2 becomes saturated by R1, leaving no bandwidth for R2 to deliver data from node 4 to node 2; R2 will therefore miss its deadline. Missing request deadlines greatly degrades service quality and violates application SLAs, sometimes at significant cost.
A better approach is to use Steiner trees to deliver the source data to all destinations. As shown in Figure 4.1(b), using trees to deliver data saves bandwidth: datacenter 1 sends one copy to datacenter 2, and datacenter 2 then sends two copies to the destinations. Request R1 takes only 5 MB/s of link 1→2, which leaves the remaining 5 MB/s for request R2. Therefore, both R1 and R2 meet their deadlines.
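The arithmetic behind this example can be checked directly: a transfer meets its deadline exactly when its allocated rate delivers the full volume in time. A minimal sketch (the helper name is our own; parameters are taken from Table 4.1 and the 5 MB/s rates from Figure 4.1(b)):

```python
def meets_deadline(volume_mb, rate_mb_s, deadline_s):
    """A transfer meets its deadline when volume / rate <= deadline."""
    return rate_mb_s > 0 and volume_mb / rate_mb_s <= deadline_s

# Steiner-tree allocation of Figure 4.1(b): each request keeps 5 MB/s
# on the shared link, and 200 MB / 5 MB/s = 40 s, exactly the deadline.
print(meets_deadline(200, 5, 40))   # True for R1 (and likewise for R2)

# Unicast allocation of Figure 4.1(a): link 1->2 is saturated by R1,
# so R2 has no rate left toward datacenter 2 and misses its deadline.
print(meets_deadline(200, 0, 40))   # False
```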
In this chapter, we propose to use Steiner trees for multiple multicast transfers. Traditional wisdom applies Steiner tree packing, but it is an NP-hard problem. We instead formulate the problem as a Linear Program (LP) and use a log-based heuristic to find sparse solutions.
Figure 4.1: A motivation example. (a) Finding paths from the source to each destination independently: link 1→2 becomes saturated, leaving no bandwidth for request R2, which misses its deadline. (b) Using Steiner trees for transfers: request R1 takes only 5 MB/s of link 1→2, and both R1 and R2 can complete before their deadlines.
4.2 System Model and Problem Formulation
In an inter-datacenter network, given a number of transfer requests arriving within a
small time interval, the key idea of our design is to determine the sending rate of each
request on each Steiner tree by solving a routing problem. We aim to maximize the
throughput for all requests, subject to deadline constraints. Moreover, we try to use a small number of Steiner trees for each request in order to reduce data-splitting overhead at the source and packet reordering at the destinations. Table 4.2 presents the variables used in this chapter and their definitions.
4.2.1 Finding Feasible Steiner Trees
Network: We model the inter-datacenter network as a directed graph G = (V,E,C).
Link capacity is assumed to be stable within a time period. C(e) denotes the available
link capacity, which is the maximum packet sending rate on edge e ∈ E.
We use Depth-First Search (DFS) to find a set of feasible Steiner trees for each request. Nodes in a tree that act as pure relays are called Steiner nodes. A Steiner tree is a distribution tree that connects the sender to the receivers, possibly through Steiner nodes. DFS starts at the source node and explores as far as possible until it reaches a destination; otherwise, it backtracks along the same path to find other nodes to traverse. It does not terminate until all destinations have been found. The set of feasible Steiner trees is denoted by T^i:

    T^i = {t | t is a Steiner tree (or multicast tree) from S^i to R^i}.
When the number of datacenters and destinations increases, the number of possible Steiner trees found by DFS becomes very large. In order to reduce the complexity of our solution, we add constraints to the search for feasible trees. We classify Steiner trees into two types: trees consisting of a single path that contains all destinations, and all other trees. A single path that includes all destinations saves bandwidth efficiently for multicast transfers, so we keep this kind of path during the DFS. For the other trees, we limit the maximum hop count to 2, which significantly reduces the number of candidate Steiner trees with negligible performance loss.
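As a sketch of the first tree type, the following DFS enumerates simple paths from the source that cover all destinations. This illustrates only the path-shaped trees, not the full search with the hop-limited second type; the function and variable names are our own, not from the thesis implementation:

```python
def paths_covering_all(adj, src, dests):
    """DFS-enumerate simple paths from src that visit every destination.

    adj: dict mapping node -> list of successor nodes (directed graph).
    A path is recorded as soon as it covers all destinations, and is
    not extended further.
    """
    dests = set(dests)
    results = []

    def dfs(node, path, visited):
        if dests <= set(path):          # all destinations on this path
            results.append(list(path))
            return
        for nxt in adj.get(node, []):
            if nxt not in visited:      # keep the path simple
                visited.add(nxt)
                path.append(nxt)
                dfs(nxt, path, visited)
                path.pop()
                visited.remove(nxt)

    dfs(src, [src], {src})
    return results

# Example on a small directed graph: 1 -> 2 -> 3 -> 4, plus 2 -> 4.
adj = {1: [2], 2: [3, 4], 3: [4], 4: []}
print(paths_covering_all(adj, 1, [3, 4]))   # [[1, 2, 3, 4]]
```

The backtracking pattern (append/add before the recursive call, pop/remove after) is what makes the search enumerate all simple paths rather than just one.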
4.2.2 Linear Program Formulation
Request completion time: The completion time of a request is measured from the moment the source starts to send data to the time all of the data has been received by every destination.
Variables        Meaning
T^i              The set of feasible Steiner trees for request i.
S^i              Source datacenter of request i.
R^i              Destination datacenters of request i.
Q^i              Data volume in bytes of request i.
D^i              Deadline requirement of request i.
a^i              Priority of request i.
G = (V, E, C)    The inter-datacenter network graph; V and E are the sets of vertices (datacenters) and edges (links), respectively; for each e ∈ E, C(e) represents the available bandwidth capacity.

Table 4.2: Variables used in the chapter.
It includes propagation delay, queueing delay, and transmission delay. Propagation and queueing delays are on the order of milliseconds; since delay-tolerant transfers are typically large, these delays are negligible. We therefore only consider transmission delay when calculating the transfer completion time.
A transfer request i is specified as a tuple {S^i, R^i, Q^i, D^i, a^i}, where a larger a^i represents a higher priority. Our objective is to maximize the network throughput for all transfers and meet as many transfer deadlines as possible. Some transfers may not have deadlines; we assign a very large deadline value to these transfers. We formulate the problem as the following linear program:
maximize    χ                                                            (4.1)

subject to  χ ≤ ∑_{t∈T^i} x^i(t),                      ∀ i = 1, …, n,    (4.2)

            ∑_{i=1}^{n} ∑_{t∈T^i} x^i(t) φ(t, e) ≤ C(e),    ∀ e ∈ E,     (4.3)

            D^i ∑_{t∈T^i} x^i(t) ≥ Q^i,                ∀ i = 1, …, n,    (4.4)

            x^i(t) ≥ 0,  χ ≥ 0,          ∀ t ∈ T^i,  ∀ i = 1, …, n,      (4.5)

where φ is defined as:

            φ(t, e) = 1 if e ∈ t, and 0 otherwise.
The linear program formulated above can be solved efficiently by a standard LP solver. The objective is to maximize the throughput of all requests, which is the sum of the flow rates over all selected Steiner trees; x^i(t) represents the flow rate on a Steiner tree t. Since flow rates of different requests contend for edge capacities, for each edge e the sum of the flow rates of the trees that use e must not exceed the edge capacity, which is reflected in constraint (4.3). Constraint (4.4) ensures that every transfer completes before its deadline. The flow rates x^i(t) and the throughput objective χ are guaranteed to be non-negative by constraint (4.5).
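To make the constraints concrete, the following sketch checks a candidate rate assignment against (4.2)–(4.5). It is an illustrative feasibility checker, not the thesis solver; all names are our own, and the edge set chosen for R2's tree below is one plausible routing, not taken from the figure:

```python
EPS = 1e-9  # numerical tolerance

def is_feasible(chi, rates, tree_edges, cap, volume, deadline):
    """Check a candidate solution against constraints (4.2)-(4.5).

    rates[i][j]      : flow rate x^i(t) on the j-th tree of request i
    tree_edges[i][j] : set of directed edges used by that tree
    cap[e]           : capacity C(e) of edge e
    volume[i], deadline[i] : Q_i and D_i of request i
    """
    # (4.5) non-negativity
    if chi < -EPS or any(r < -EPS for row in rates for r in row):
        return False
    # (4.2) chi is a lower bound on every request's total rate, and
    # (4.4) D_i * total rate >= Q_i, i.e. the transfer finishes in time
    for i, row in enumerate(rates):
        total = sum(row)
        if chi > total + EPS or deadline[i] * total < volume[i] - EPS:
            return False
    # (4.3) summed rates on each edge respect its capacity
    load = {}
    for row, trees in zip(rates, tree_edges):
        for r, edges in zip(row, trees):
            for e in edges:
                load[e] = load.get(e, 0.0) + r
    return all(load[e] <= cap[e] + EPS for e in load)

# Motivation example (Table 4.1): each request gets one tree at 5 MB/s,
# sharing edge (1, 2) whose capacity is 10 MB/s.
ok = is_feasible(
    chi=5,
    rates=[[5], [5]],
    tree_edges=[[{(1, 2), (2, 3), (2, 4)}], [{(4, 1), (1, 2), (2, 3)}]],
    cap={(1, 2): 10, (2, 3): 10, (2, 4): 10, (4, 1): 10},
    volume=[200, 200],
    deadline=[40, 40],
)
print(ok)   # True
```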
Post-Processing the Linear Program: It is possible that meeting all transfer deadlines would exceed the link capacities, in which case the linear program has no feasible solution. When this happens, our approach is to
reject the transfer with the lowest priority. If multiple requests share the same priority, we reject the request that demands the largest bandwidth.
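This rejection rule can be sketched as a single key function (field names are our own, hypothetical choices): lowest priority loses first, and among ties the request with the largest minimum required rate Q_i / D_i is dropped:

```python
def pick_request_to_reject(requests):
    """Pick the request to drop when the LP is infeasible.

    requests: list of dicts with 'priority' (larger = more important),
    'volume' (Q_i, in MB) and 'deadline' (D_i, in seconds).
    Ties on priority are broken by the largest demand Q_i / D_i.
    """
    return min(requests,
               key=lambda r: (r['priority'], -r['volume'] / r['deadline']))

reqs = [
    {'name': 'A', 'priority': 2, 'volume': 300, 'deadline': 10},
    {'name': 'B', 'priority': 1, 'volume': 100, 'deadline': 10},
    {'name': 'C', 'priority': 1, 'volume': 400, 'deadline': 10},
]
print(pick_request_to_reject(reqs)['name'])   # 'C': lowest priority, largest demand
```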
4.2.3 Choosing Sparse Solutions
The linear program formulated above has a collection of feasible solutions. Since we use multiple Steiner trees to deliver source data to the destinations of each request, it is inevitable to split data at the source, which adds splitting overhead. Using multiple trees also adds packet-reordering overhead at the destinations. To reduce such overhead, we prefer to use few trees for distributing data, which requires choosing sparse solutions from among the feasible ones. We therefore add a penalty function to the objective:
maximize  χ − μ ∑_{i=1}^{n} ∑_{t∈T^i} g(x^i(t)),                        (4.6)

subject to the same constraints (4.2)−(4.5), where g(x^i(t)) is defined as:

          g(x^i(t)) = 0 if x^i(t) = 0, and 1 if x^i(t) > 0.
Problem (4.6) differs from Problem (4.1) in its objective function. To obtain optimal throughput while using fewer trees, μ should be neither too large nor too small: too large a μ could push the solution far from optimality, while too small a μ could lead to many trees being selected. In our experiment settings, we let μ = 0.01, and Problem (4.6) returns almost the same throughput value as Problem (4.1); the error is smaller than 10⁻⁸, which is negligible. We will show this in the experiment results.
Problem (4.6) is a non-convex optimization problem. A log-based heuristic is widely used for finding sparse solutions; the basic idea is to replace g(x^i(t)) by log(|x^i(t)| + δ), where δ is a small positive threshold that determines what counts as close to zero. Since the problem is still not convex, we linearize the penalty function, inspired by [29], using a weighted l1-norm heuristic:

maximize  χ − μ ∑_{i=1}^{n} ∑_{t∈T^i} W^i(t) · x^i(t),                  (4.7)

subject to the same constraints (4.2)−(4.5). In each iteration, we recalculate the weight function W^i, where:

          W^i(t) = 1 / ((x^i(t))^k + δ).
Problem (4.7) is then a linear program, and it is solved iteratively. Here (x^i(t))^k is the solution obtained in the k-th iteration, and δ is a small positive constant. Observe that the smaller (x^i(t))^k is, the larger the weight W^i(t) becomes, pushing x^i(t) toward zero. Upon convergence, (x^i(t))^k ≈ (x^i(t))^{k+1} = (x^i(t))^*, for i = 1, …, n, t ∈ T^i, and then:

          W^i(t) · (x^i(t))^* = (x^i(t))^* / ((x^i(t))^k + δ) = 0 if (x^i(t))^* = 0, and ≈ 1 if (x^i(t))^* > 0.

Eventually, the transformed Problem (4.7) approaches Problem (4.6) and yields sparse solutions. Algorithm 4.1 presents a summary of our solution.
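The fixed-point behavior of the reweighting step can be seen in isolation: at convergence, the weighted term W^i(t)·x^i(t) recovers the 0/1 penalty g. A minimal sketch, with δ as in Algorithm 4.1 and helper names of our own:

```python
DELTA = 1e-8   # the small positive constant delta from Algorithm 4.1

def update_weights(x_prev):
    """Reweighting step of the l1-norm heuristic: W = 1 / (x^k + delta)."""
    return [1.0 / (x + DELTA) for x in x_prev]

# At a fixed point x^{k+1} = x^k = x*, the weighted term W * x* recovers
# the 0/1 penalty g(x): zero rates cost 0, nonzero rates cost about 1.
x_star = [0.0, 2.5, 0.01]
w = update_weights(x_star)
penalties = [wi * xi for wi, xi in zip(w, x_star)]
print(penalties)   # approximately [0.0, 1.0, 1.0]
```

This is why the iteration drives small rates to exactly zero while leaving large rates essentially unpenalized, yielding sparse tree selections.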
4.2.4 Proof of Convergence
In this section, we provide a brief proof of convergence for the iterative Problem (4.7), restated as follows:
Algorithm 4.1 Deadline-aware routing for multiple multicast transfers
1: Input: transfer requests {S^i, R^i, Q^i, D^i, a^i}; network topology G = (V, E, C).
2: k := 0. Initialize δ = 10⁻⁸, (W^i(t))^0 = 1, sparse_flag = False.
3: Update k = k + 1.
4: If sparse_flag == False: given the solution (x^i(t))^k from the previous iteration, set W^i(t) = 1 / ((x^i(t))^k + δ), then solve the linear program (4.7) to obtain the flow rates (x^i(t))^{k+1}, the optimal throughput χ^{k+1}, and the solver status.
5: If status == infeasible: remove the request with the lowest priority, then solve the linear program (4.7) with the updated inputs to obtain (x^i(t))^{k+1}, χ^{k+1}, and status.
6: If status == optimal: if (x^i(t))^{k+1} ≈ (x^i(t))^k, return (x^i(t))^* ≈ (x^i(t))^{k+1}; else go to Step 3 for another iteration. If status == infeasible, go to Step 5.
7: Output: flow rates {x^i(t)} and the corresponding Steiner trees {t | t ∈ T^i}.
Proposition 4.1.

maximize  χ − μ ∑_{i=1}^{n} ∑_{t∈T^i} x^i(t) / ((x^i(t))^k + δ)          (4.8)

subject to  x = (x^1(t), …, x^n(t)) ∈ C,  ∀ t ∈ T^i,                      (4.9)

with δ > 0 and x^i(t) ≥ 0 for i = 1, …, n, where C ⊂ ℝ^n is a convex, compact set. When k → ∞, we have (x^i(t))^{k+1} − (x^i(t))^k → 0, for all i, t ∈ T^i.

Proof. Let N_i denote the number of Steiner trees for request i. Since Problem (4.8) yields (x^i(t))^{k+1}, and its penalty term minimizes ∑_{i=1}^{n} ∑_{t∈T^i} x^i(t) / ((x^i(t))^k + δ), we have:

∑_{i=1}^{n} ∑_{t∈T^i} ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ)
    ≤ ∑_{i=1}^{n} ∑_{t∈T^i} ((x^i(t))^k + δ) / ((x^i(t))^k + δ) = ∑_{i=1}^{n} N_i.    (4.10)
Using the inequality between the arithmetic and geometric means, we have:

(1 / ∑_{i=1}^{n} N_i) ∑_{i=1}^{n} ∑_{t∈T^i} ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ)
    ≥ [ ∏_{i=1}^{n} ∏_{t∈T^i} ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ) ]^{1 / ∑_{i=1}^{n} N_i}.    (4.11)

Combining (4.10) and (4.11), we get:

[ ∏_{i=1}^{n} ∏_{t∈T^i} ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ) ]^{1 / ∑_{i=1}^{n} N_i} ≤ 1.    (4.12)

We let

A((x^i(t))^k) = ((x^i(t))^k + δ)^{1 / ∑_{i=1}^{n} N_i}.    (4.13)

Since (x^i(t))^k ≥ 0 and δ > 0, A((x^i(t))^k) is bounded below by δ^{1 / ∑_{i=1}^{n} N_i}, so A((x^i(t))^k) converges to a nonzero limit as k → ∞, which implies that

lim_{k→∞} ∏_{i=1}^{n} ∏_{t∈T^i} [ ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ) ]^{1 / ∑_{i=1}^{n} N_i} = 1.    (4.14)

Now, combining (4.14) with Equations (4.10) and (4.11), as k → ∞ we have:

∑_{i=1}^{n} N_i ≤ ∑_{i=1}^{n} ∑_{t∈T^i} ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ) ≤ ∑_{i=1}^{n} N_i,    (4.15)

which is equivalent to:

lim_{k→∞} ∑_{i=1}^{n} ∑_{t∈T^i} ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ) = ∑_{i=1}^{n} N_i.    (4.16)

Therefore, ((x^i(t))^{k+1} + δ) / ((x^i(t))^k + δ) → 1 as k → ∞, which means that (x^i(t))^{k+1} ≈ (x^i(t))^k. Convergence proved. □
4.2.5 An Example of the Optimal Solution
Figure 4.2: An example of the optimal solution obtained by solving the linear program in Sec. 4.2.3 for maximizing the total throughput of all requests. Each request is assigned one or more Steiner trees, with a flow rate allocated on each tree.
An example using the inter-datacenter network is shown in Figure 4.2. To simplify the example, we assume all link capacities are 15 MB/s. Consider two requests, R1 and R2: R1 needs to send source data from datacenter 2 to datacenters 1 and 4, while R2 needs to send source data from datacenter 5 to datacenters 1 and 3. Table 4.3 gives the detailed requirements of the two requests. We use this example to explain the benefit of our linear programming formulation, which tries to maximize throughput while meeting the deadlines of all requests; Figure 4.2 shows the optimal solution obtained by solving the linear program in Sec. 4.2.3. Our solution splits the source data at the sender according to the flow rate allocated on each tree and sends the data through the different trees. We can see that both requests meet their deadlines, and R2 can even finish the
Requests   Source   Destinations   Volume (MB)   Deadline (seconds)
R1         2        1, 4           300           8
R2         5        1, 3           300           18

Table 4.3: Request requirements for the example.
transfer before its deadline since the linear program aims at maximizing throughput.
If we treat each multicast transfer as multiple unicast transfers, R1 will miss its deadline, and this approach wastes a great deal of bandwidth. DDCCast [27] finds only one minimum-weight Steiner tree for each request. In our example, the largest capacity of a single tree is only 15 MB/s; if we use only one tree to distribute the data, the shortest possible transfer time is 20 s, so both R1 and R2 would still miss their deadlines. Using multiple trees increases the throughput of a transfer, which allows more transfers to meet their deadlines.
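A simple lower bound makes the point. This back-of-the-envelope helper is our own (it ignores shared links): a request needs at least ⌈(Q/D)/c⌉ trees when each tree is capped at the per-link capacity c.

```python
import math

def min_trees_needed(volume_mb, deadline_s, tree_cap_mb_s):
    """Lower bound on the number of trees: the required rate Q/D is
    split over trees that each carry at most the bottleneck capacity."""
    required_rate = volume_mb / deadline_s
    return math.ceil(required_rate / tree_cap_mb_s)

# Table 4.3 with 15 MB/s links: one tree (DDCCast) is never enough for R1.
print(min_trees_needed(300, 8, 15))    # 3: R1 needs 300/8 = 37.5 MB/s
print(min_trees_needed(300, 18, 15))   # 2: R2 needs about 16.7 MB/s
print(min_trees_needed(300, 20, 15))   # 1: a 20 s deadline fits one tree
```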
4.3 Implementation
We have completed a real-world implementation in a software-defined overlay network testbed at the application layer. Unlike traditional SDN techniques, our application-layer SDN does not need to cope with complicated lower-layer properties and management. Besides, our application-layer solution has higher switching capacity, supporting more forwarding rules at each datapath node and scaling well to a large number of transfer requests.
Figure 4.3 shows the high-level architecture of our application-layer solution. After the testbed starts, the controller and datapath nodes establish persistent TCP connections with each other; we use iperf to measure the bandwidth between each pair of nodes and send it to the controller, an important input for making routing decisions.
Figure 4.3: Architecture of the application-layer SDN design: a central controller makes routing decisions, and each datapath node runs a local aggregator that forwards data.
We employ a local aggregator at each datapath node; this aggregator helps to aggregate and schedule inter-datacenter flows. In our experiment, we use six Virtual Machine (VM) instances located in six different datacenters, and one of the VMs is also launched as the central controller.
Now we explain how an inter-datacenter transfer is routed and completed through the application-layer SDN testbed. After a transfer request is submitted, the relevant destination nodes first subscribe to a specific channel using a subscriber API implemented in Java; the source node then publishes its data, destinations, deadline requirement, and priority information to the channel using a publisher API. Source data are aggregated at the local aggregator, and the aggregator consults the controller for routing rules. In the controller, our routing algorithm, implemented in Python, computes routing rules from the bandwidth measurements and the request's information. Two types of routing rules are published to each datapath node: one is {'NodeId': xx, 'NextHop': xx, 'SessionId': xx}, which indicates the next-hop datacenter for the current datapath node; the other is {'NodeId': xx, 'Weight': xx, 'SessionId': xx}, where the value of
'Weight' indicates the sending rate of the datapath node. After the aggregator receives the routing rules, if multiple trees are needed for sending the data, the source data are split at the source node. When data arrive at the aggregator of another node, the aggregator checks the rule: if 'NextHop' is the node itself, the data have been delivered successfully and are written back to disk; if 'NextHop' names a different node, the data are relayed by the aggregator to that node.
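The aggregator's forwarding decision can be sketched as follows. The rule fields match those above; the write/relay callbacks are hypothetical stand-ins for the testbed's disk and socket code:

```python
def handle_block(rule, node_id, block, write_to_disk, relay):
    """Dispatch one data block according to a routing rule.

    rule: {'NodeId': ..., 'NextHop': ..., 'SessionId': ...}
    If this node is the rule's next hop, the block has arrived and is
    persisted; otherwise it is relayed toward the next-hop datacenter.
    """
    if rule['NextHop'] == node_id:
        write_to_disk(block)
        return 'delivered'
    relay(block, rule['NextHop'])
    return 'relayed'

# Minimal usage with recording callbacks:
delivered, relayed = [], []
rule = {'NodeId': 2, 'NextHop': 3, 'SessionId': 7}
print(handle_block(rule, 3, b'data', delivered.append,
                   lambda b, hop: relayed.append((b, hop))))   # 'delivered'
print(handle_block(rule, 2, b'data', delivered.append,
                   lambda b, hop: relayed.append((b, hop))))   # 'relayed'
```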
In our experiment, we generate transfer requests within a small time interval and try to send all of them before their deadlines using our routing algorithm. When a request is rejected, the controller makes a new routing decision once capacity becomes available and sends the decision to all datapath nodes.
4.4 Performance Evaluation
Now we are ready to evaluate the performance of our real-world implementation. In this
section, we present our experiment settings and evaluation results.
4.4.1 Experiment Setup
We have deployed our real-world implementation, with the linear program routing algorithm, on the Google Cloud Platform across six geographically distributed datacenters. In each datacenter, we launch one Virtual Machine (VM). The locations of these datacenters are shown in Figure 4.4.
In our deployment, we use the VM instances in all datacenters as datapath nodes, and the VM instance in Iowa (us-central1-a) serves as the controller of our application-layer testbed. All VM instances are of type n1-standard-4, each with
Figure 4.4: The 6 Google Cloud datacenters used in our deployment and experiments: US West (Oregon), US Central (Iowa), US East (North Virginia), Europe West (London), Asia East (Taiwan), and Asia Northeast.
4 vCPUs, 15 GB of memory, and a 10 GB solid-state drive, running Ubuntu 14.04 LTS. In our experiment, we aim to show the benefit of using multiple Steiner trees for transfers, so we use Linux Traffic Control (TC) to give each inter-datacenter link a uniform 120 Mbps of bandwidth.
We use the Linux command truncate to generate an input file of a fixed size for each request. In our experiment, when a request is submitted, the destinations of the request first invoke the Java subscriber() API to subscribe to the request. After that, the VM instance holding the source data invokes the Java publisher() API to read the file in 4 MB blocks and publish them to the aggregator.
4.4.2 Evaluation Methodology
Workload: We use file replication as the inter-datacenter traffic. For each transfer, we pick the source randomly from the six datacenters and increase the number of destinations from 1 to 5. The volume of each file is set to 300 MB. For deadline-constrained transfers, we choose deadlines from a uniform distribution over [T, αT], as in OWAN [53], where
α represents the tightness of the deadlines: when α is small, transfers have very close deadlines. T is the shortest deadline of all requests, which is related to the volume of the transferred data and the number of transfers. The priority value is generated randomly for each transfer. We run our experiments over multiple time slots; in each time slot, six transfer requests are generated within a small interval at the beginning of the slot. The length of each time slot is then the longest deadline among the requests.
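The workload generation described above can be sketched as follows (parameter names and the priority range are our own illustrative choices):

```python
import random

def generate_requests(n, base_deadline, alpha, volume_mb=300, seed=42):
    """Generate n transfer requests for one time slot.

    Deadlines are drawn uniformly from [T, alpha*T], as in OWAN, and
    each request receives a random integer priority.
    """
    rng = random.Random(seed)          # seeded for reproducibility
    return [{'volume': volume_mb,
             'deadline': rng.uniform(base_deadline, alpha * base_deadline),
             'priority': rng.randint(1, 10)}
            for _ in range(n)]

reqs = generate_requests(6, base_deadline=10, alpha=2)
print(len(reqs))                                      # 6
print(all(10 <= r['deadline'] <= 20 for r in reqs))   # True
```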
Performance metrics: We measure two metrics: the inter-datacenter throughput and the percentage of requests that meet their deadlines. The inter-datacenter throughput is the total size of all transferred files divided by the total transfer time needed to finish the requests, in Mbps.
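In code, this metric is a one-liner; note the unit conversion from megabytes to megabits (a factor of 8). The function name is our own:

```python
def avg_throughput_mbps(file_sizes_mb, total_time_s):
    """Inter-datacenter throughput: total bits transferred / total time.

    file_sizes_mb are in megabytes, so multiply by 8 to get megabits.
    """
    return sum(file_sizes_mb) * 8.0 / total_time_s

# Two 300 MB replications completing within 40 s in total:
print(avg_throughput_mbps([300, 300], 40))   # 120.0 (Mbps)
```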
We compare our solution with two other solutions, DDCCast [27] and Amoeba [25]. DDCCast finds only one tree for each request and schedules requests as late as possible: to maximize utilization in the current time slot, it pulls some traffic into the current slot and pushes other traffic forward, close to the deadlines. Since DDCCast uses only one tree per request, it cannot accommodate some requests with early deadlines. Amoeba considers unicast transfers; it finds the k shortest paths for each source-destination pair.
4.4.3 Evaluation Results
Sparse solution performance: We compare the sparse solution with the original linear program (without the penalty) in Table 4.4. From the table, we can see that the sparse routing approach attains the same optimal value as the original linear program while using far fewer trees.
Completion time deviation: We run the experiment over 10 time slots and record the completion time of each transfer request. The completion time is the time from
Requests     1      2      3      4      5      6     7      8      9      10     Optimal Value
Workload 1   2/15   5/15   2/18   2/15   2/15   3/6   1/18   3/15   1/15   5/15   5.558/5.558
Workload 2   3/18   2/18   2/18   2/11   1/11   2/18  1/11   3/18   1/11   2/18   7.901/7.901

Table 4.4: Comparison of the sparse routing approach and the original linear program. In each cell, the left number is the number of trees used by the sparse solution and the right number is the number of trees used by the original linear program; the last column compares their optimal throughput values.
Figure 4.5: Completion time deviation: CDF of (actual completion time − scheduled completion time), in seconds.
the moment the destinations subscribe to the request until all destinations have received the source data. To show that our solution schedules deadline-constrained requests effectively, we plot in Figure 4.5 the CDF of the difference between the actual and scheduled completion times. We observe that 80% of the requests finished before their scheduled time. A possible reason is that we use TC to cap the bandwidth at 120 Mbps and use the same value in our routing decisions, yet flows sometimes cannot reach the full 120 Mbps. We therefore set the link bandwidth slightly higher, at 130 Mbps, in later experiments.
Early deadline requests and the deadline tightness factor: Some requests may have early deadlines, and our solution performs better for these requests. When
Figure 4.6: Comparison of different solutions for early deadline requests (x-axis: tightness factor α; y-axis: percentage of requests that meet deadlines; schemes: our solution, Amoeba, DDCCast).
a request requires more bandwidth than any single link capacity in the network, one routing tree is not enough for the request to meet its deadline. We generate deadlines from a uniform distribution over [T, αT]. To show the benefit of our solution for requests with early deadlines, we set T = 10 s and increase α from 1.2 to 4 to observe the effect of the tightness factor.
Figure 4.6 presents the comparison of the different solutions when the number of destinations is 2. The x-axis is the tightness factor α; the y-axis is the percentage of requests that meet their deadlines. When the deadline ranges from 10 s to 20 s, DDCCast cannot accommodate such requests, because the largest capacity of a single tree is 120 Mbps. As α increases, more requests meet their deadlines because the range of deadlines becomes larger. Amoeba achieves a lower percentage of deadline-meeting requests because its unicast transfers use more bandwidth per transfer than our solution. DDCCast performs worse than Amoeba because it cannot accommodate transfers with deadlines earlier than 20 s. The comparison shows that our solution admits
Figure 4.7: Comparison of different solutions as the number of destinations increases (x-axis: number of destinations, 1 to 5; y-axis: percentage of requests that meet deadlines).
more early deadline requests than DDCCast and Amoeba.
Effect of the number of destinations: We increase the number of destinations from 1 to 5, with α = 2 and T = 20 s. Figure 4.7 shows the percentage of requests that meet their deadlines as the number of destinations increases; our solution admits more transfers than the other two solutions. As the number of destinations grows, Amoeba does not have enough bandwidth to allocate to all source-destination pairs. DDCCast finds one minimum-weight tree for each transfer, and as the number of destinations increases, some transfers may have no room to be scheduled.
Throughput: To demonstrate the throughput improvement of our solution, we plot the throughput performance in Figure 4.8. The average throughput is calculated as the total file size of all requests that meet their deadlines divided by the total transfer time; we only count requests that meet their deadlines. Our solution achieves the highest utilization of network bandwidth and admits more transfers than the other two solutions, so its throughput is also the highest. We can see that, when the number
Figure 4.8: Throughput comparison of different solutions (x-axis: number of destinations; y-axis: average throughput in Mbps).
of destinations is 1 or 2, Amoeba has higher throughput than DDCCast, possibly because DDCCast always tries to push some transfers close to their deadlines, which can make the transfer time longer than under Amoeba.
Scalability: To show the scalability of our linear program with sparse solutions, we record the running time of the LP for different numbers of input variables, shown in Figure 4.9. The running time is averaged over multiple runs. With 900 input variables, the running time is under 1.75 s, which is acceptable compared with the transfer times of the requests. The result shows that our solution is efficient and converges quickly. Moreover, the number of datacenters in practice is usually small; thus our solution is scalable.
In a nutshell, the evaluation results show that our solution maximizes network throughput and admits more transfers than DDCCast and Amoeba. Compared with DDCCast, our solution can admit requests with early deadlines, which demand more bandwidth than any single link capacity.
Figure 4.9: The computation time of our approach (x-axis: number of variables, 36 to 900; y-axis: running time in seconds).
4.5 Summary
4.5.1 Discussion
We now discuss some directions for future work.
Dynamic resources: Like previous related works, our work assumes that network resources are stable. However, network resources can change dynamically over time. In future work, we may consider dynamic resources when making routing decisions: the controller would measure bandwidth at each time slot and repack the remaining requests under the current network resources.
Different request arrival rates: Our work does not explore the effect of the request arrival rate. Since our objective is to maximize throughput and accommodate a maximal number of transfers with deadline requirements, we assume requests arrive within a small time interval at the beginning of each time slot. The results show that our solution performs well in routing requests that arrive close together. In future work, we may add
a time dimension to our formulation and explore the effect of different request arrival rates.
4.5.2 Conclusion
In this chapter, we design an efficient solution for multicast inter-datacenter transfers that aims to maximize network throughput and meet as many transfer deadlines as possible. Traditionally, treating a multicast transfer as multiple independent unicast transfers wastes bandwidth and causes other requests to miss their deadlines. We therefore propose to use multiple Steiner trees for each multicast transfer. We formulate the problem as a linear program (LP) and find sparse solutions using a weighted l1-norm heuristic. To demonstrate the practicality and efficiency of our solution, we have implemented it in a software-defined overlay network testbed at the application layer, using the Google Cloud Platform for real-world experiments with six Virtual Machine instances in six different datacenters. Experimental results show that our design outperforms related existing works in maximizing throughput and meeting transfer deadlines.
Chapter 5
Conclusion
Our focus in this thesis is to study the problem of resource optimization across geo-
graphically distributed datacenters. For cloud providers, it is essential to meet customer
requirements in time, guarantee quality of service, and reduce resource wastage. Since
different users have various resource requirements, resource optimization algorithms used
by cloud providers have a significant impact on the performance of virtual machines that
users rent for computation as well as on the ability of datacenters to accommodate user
requests. In this thesis, we propose different approaches for resource optimization in
various aspects.
Cloud computing provides a large pool of resources for users to store and process
their data. Users require the allocation of virtual machines (VMs) in datacenters to meet
their computational needs. We first study the multi-dimensional VM placement problem.
Because users often utilize fewer resources than their reserved capacities, resource overcommitment is incorporated in most cloud products to reduce resource wastage. Existing works only consider maximizing the resource utilization of PMs and minimizing the number of PMs used to save energy, which risks overloading PMs and degrading VM
performance. To solve this problem, we propose a threshold-based algorithm Min-DIFF
which can achieve a balanced use of resources along different dimensions and reduce the
risk of PM overloading. Extensive simulation results have shown that our algorithm
achieves better performance and accommodate more requests than related works.
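The core placement step can be illustrated with a minimal sketch in the spirit of a threshold-based, imbalance-minimizing algorithm. The names, the fixed threshold, and the max-minus-min imbalance score below are illustrative assumptions, not the thesis's exact Min-DIFF rule:

```python
# Sketch of threshold-based, balance-aware VM placement (illustrative only).
THRESHOLD = 0.9  # assumed per-dimension utilization cap to limit overload risk


def place_vm(vm_demand, pms):
    """Pick a PM for a VM request.

    vm_demand: normalized demand per dimension (e.g. [cpu, memory]).
    pms: current normalized utilization vectors, one per PM.
    Returns the index of the chosen PM, or None if no PM qualifies.
    """
    best, best_diff = None, float("inf")
    for i, used in enumerate(pms):
        after = [u + d for u, d in zip(used, vm_demand)]
        if any(a > THRESHOLD for a in after):
            continue  # placement would exceed the safety threshold
        diff = max(after) - min(after)  # imbalance across dimensions
        if diff < best_diff:
            best, best_diff = i, diff
    return best


pms = [[0.5, 0.2], [0.3, 0.35]]
print(place_vm([0.1, 0.1], pms))  # -> 1: PM 1 yields the more balanced load
```

Choosing the PM that minimizes the post-placement spread across dimensions keeps no single resource (CPU, memory, ...) saturated while others sit idle, which is the balancing behavior the paragraph above describes.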
As the volume of data grows, storing all of it within a single datacenter is no
longer feasible, and the data naturally need to be distributed across multiple datacenters.
This is further motivated by the fact that the data to be processed, such as user activity
logs, are generated in a geographically distributed fashion. It is therefore more efficient to
store the data where they are generated, which leads to deploying cloud computing
resources over many datacenters in a wide area network. Because of this geographic
distribution, many applications need to process data across different datacenters.
Bandwidth between datacenters is costly and scarce, and when multiple transfers share
the same inter-datacenter link, allocating resources to these transfers while meeting
their requirements is challenging.
Considering that most inter-datacenter transfers need to be completed before their
deadlines, meeting as many deadlines as possible is an important objective. We propose to
use multiple Steiner trees for each inter-datacenter multicast transfer, and formulate the
problem as a linear program whose objective is to maximize throughput over all transfer
requests while meeting their deadlines. Through experiments on the Google Cloud
Platform, we have shown that our deadline-aware solution achieves higher network
throughput and a lower rejection rate than related works.
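A throughput-maximizing, deadline-aware tree-rate allocation of this kind can be sketched, in generic form with illustrative symbols (not necessarily the thesis's exact formulation), as:

```latex
\begin{aligned}
\text{maximize} \quad   & \sum_{i} \sum_{j \in T_i} x_{ij} \\
\text{subject to} \quad & \sum_{i} \sum_{j \in T_i : \, e \in j} x_{ij} \le c_e
                        && \forall \text{ links } e, \\
                        & \sum_{j \in T_i} x_{ij} \ge \frac{V_i}{d_i}
                        && \forall \text{ transfers } i, \\
                        & x_{ij} \ge 0,
\end{aligned}
```

where \(T_i\) is the set of candidate Steiner trees for transfer \(i\), \(x_{ij}\) the rate assigned to tree \(j\), \(c_e\) the capacity of inter-datacenter link \(e\), and \(V_i/d_i\) the minimum aggregate rate needed to move volume \(V_i\) before deadline \(d_i\).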