
Evaluation of energy consumption in virtualization environments
Proof of concept using containers

Jonathan Westin

VT 2017
Bachelor’s thesis, 15 ECTS
Supervisor: P-O Ostberg
Examiner: Pedher Johansson
Bachelor of Science Programme in Computing Science, 180 ECTS

Abstract

The demand for cloud services offering virtualization increases with a continual interest in different types of applications. Regardless of the resource demand of the application, some suppliers bill by time of usage, which makes the pricing unfair for the clients.

In this thesis, the characteristics and properties of virtualization are used to make room for another basis of payment. Since power consumption is a measurement that both parties can understand, the thesis investigates whether there are useful ways of measuring the energy consumption of an application.

The thesis creates a proof of concept for predicting energy consumption from the CPU usage of a containerized application. With an empirical simulation and the use of linear regression, the results indicate that it is feasible to estimate the energy consumption from the simulation data with a deviation of at most 0.813 Watt. Even though the proof of concept is a harsh simplification of the problem of predicting the energy consumption of an application, the thesis addresses the issue of energy consumption within data centers.

Contents

1 Introduction

1.1 Problem statement

2 Background

2.1 Clouds and Clusters

2.2 Wikimedia, a distributed example

2.3 Virtualization

2.4 Cloud services, virtualization, and energy consumption

3 Related work

3.1 Cluster resource management related to energy consumption

3.2 Containers versus VMs, energy, and performance

3.3 Pricing the Cloud

4 Method

4.1 General method approach

4.2 Proof of concept

5 Result

6 Discussion

6.1 Proof of concept

6.2 Related work

6.3 Predicting energy consumption, pricing, and future work

References

A Parsed raw data

B CPU usage, Joule, and Watt

C Result figures, full page width size


1 Introduction

Cloud services have become one of the fastest growing segments of the service market, yet the cloud service of renting infrastructure, so that customers can run an application without their own data center, is not without complications. The customers and providers share a common goal: the best performance for the lowest financial cost. From the perspective of the provider, the profit comes from balancing the computers in the cluster to minimize idle capacity, while from the customers' perspective it means buying the performance needed for their application at the lowest cost. In other words, the customers only want to pay for what they use, and providers want to rent out the cluster as effectively as possible. This leads to one possible outcome: customers sharing computers in the cloud in an effective manner.

Subsequently, it is essential for both parties to be able to monitor telemetry metrics, as the definition of best performance is not the same for both parties. Providers need to be able to manage the cluster efficiently, and the customers must be able to inspect the performance of their applications. Otherwise, customers cannot make the strategic and financial adjustments their applications need.

Additionally, another obstacle arises from sharing computers. The customers do not want other customers to have access to their application, but the provider wants to maintain the profit of having customers share as much of the accessible assets as possible. This introduces the need to isolate customers from each other for integrity and security reasons when, for instance, they share the same physical computer.

Fortunately, virtualization solves the problem of isolation within a computer. It makes it possible to obtain the telemetry, security, integrity and isolation mentioned previously while sharing a computer. The norm is currently to use Virtual Machines, which create an operating system within the operating system running on the computer. Another method of virtualization is containerization. In recent years the latter approach has gained considerable popularity, and it has several advantages over Virtual Machines. Although the usage of containerization is new, the theory is not. The two kinds of virtualization are explained further in Section 2.3.

The common way for cloud providers to set pricing is by allocating pure hardware. This makes it nearly impossible to close the gap between what is economically beneficial for each party: customers want the possibility to extend and shrink their use of performance and only pay for the performance used, while the providers want to use the cluster in a way that is beneficial in terms of both performance and finances.

The general costs for cloud providers are the essentials of any data center: the hardware, location, and internet access. These are basic and predictable costs; the price of energy, on the other hand, is more uncertain. The energy consumption is directly related to the number of computers, or nodes, running within the data center. To regulate it, it is not an unfamiliar thought to turn off the nodes that are idle; this would, however, not address the problem of having acquired high-end hardware that goes unused.


In the ideal world, the customers would not pay for anything not used and cloud providers would use the full extent of their cluster. Hence for a non-profit cloud provider, the pricing would be calculated to cover the anticipated cost with the addition of the energy consumption of each customer.

To estimate the energy consumption of an application (running in a node shared with other applications in a virtualized environment), it must be possible to predict energy consumption by looking exclusively at the metrics that the virtualization software provides for the application. As described in Section 1.1, this thesis looks at this predicament.

1.1 Problem statement

This thesis makes a proof of concept that uses only telemetry metrics from a containerization software platform to estimate the energy consumption of an application. From the gathered data, a linear regression analysis is made to estimate the energy consumption from the total CPU usage of the application. The result from the proof of concept is then used to discuss the possibility of pricing based on energy consumption.

2 Background

The idea behind distributed systems is simple: creating a single computer with the performance to handle a large workload without struggle is not a budget-wise effort. Dividing the computing between a larger set of modest computers is thus more economical. Other aspects also play an integral role in the demand for distribution, including that network communication is often limited by the local ISP. Distribution can increase performance for network-intensive applications (e.g., World Wide Web). Distribution does not come without complications; it produces other problems within software engineering, including the need for transparency and scalability when creating an application.

This section introduces the current state of cloud services and explains virtualization and distribution with a real example. The section also covers current research on energy consumption, from the point of view of the whole data centre down to the energy consumption of an application.

2.1 Clouds and Clusters

Cloud services such as cloud computing and cloud storage are a growing industry. The estimated size of the clouds in 2012 exceeded 1 exabyte [17], and the cloud size keeps increasing. The advantages of clouds are that the user does not need a physical computer, and that the maintenance of updating and upgrading both the underlying software and hardware is removed from the user.


2.2 Wikimedia, a distributed example

The Wikimedia Foundation is mostly recognized for the website Wikipedia. The website is recorded as one of the most viewed websites[1] on the Internet. Wikimedia counts about 15 billion web page requests per month [36][33]. The English Wikipedia website alone, which is responsible for half of the requests [34], has on average 5 443 961 web requests per hour (on average 1512.2 per second) with a mean response time of 0.958 seconds[2].

The incoming traffic cannot sensibly be handled by one single server, nor is it convenient for all network traffic to go to the same geographical location when the users are worldwide; distribution is key for survival in such cases. Distribution is needed not only across different geographical places but also within the same server and cluster. A typical setup for a high traffic web application is a load balancer[20], HTTP caching servers [22], web application servers (e.g. Apache/Nginx) and a database; a simple case of local distribution is presented in Figure 1. This is a minimum setup for a high traffic website. Wikimedia consists of over 350 servers to manage the incoming load of requests. Figure 1 only shows some of the parts; Wikimedia also includes servers for caching, log collection and so on [35].

Figure 1: Simple prototype of a small distribution of a web application.

2.3 Virtualization

At the current time, the general practice of virtualization in the cloud is to give clients their own separate virtual machine. A virtual machine makes it possible to run another operating system on top of the host operating system. This is beneficial when isolation is needed, and it brings other benefits: recovery and backup are straightforward since the state of the virtual machine can be saved in a snapshot, which also makes it possible to move the virtual machine to a new node within a cluster. This gives both the security and integrity desired by the client. However, a virtual machine carries a lot of overhead in both disk space and memory.

Another kind of virtualization is containers, also referred to as containerization, which makes it possible to run software without dependencies on hardware. The theory behind containerization is old, but it has only recently become popular, as later innovations and development made it practical. Containers make it possible to run applications on different operating systems and disregard the operating system dependencies. Using containers has the same advantages mentioned for virtual machines: isolation and integrity from the server. Another advantage is the scalability of containers; a containerized application can easily be replicated or distributed autonomously and near-instantaneously. Should one node of the cluster become unavailable, the containers can easily and quickly be moved to another node [31, 11].

There exist different container software platforms (engines), such as Linux Containers[7], Docker[13] and Kubernetes[4]. They make it possible for users to initialize their application in a container. All are to a large degree open platforms and can be used without charge.

Figure 2: Comparison between containerization and the hypervisor of a virtual machine, showing the overhead created by virtual machines.

Whilst using virtual machines is still the traditional way of virtualization, running applications in virtual machines makes the application dependent on the OS of the virtual machine. Containers, on the other hand, remove the dependency obstacle. Likewise, as seen in Figure 2, a container software platform (container engine) removes the overhead generated by the virtual machine for each client. This makes cloud containerization less storage demanding than virtual machines [28, 15].

2.4 Cloud services, virtualization, and energy consumption

The cloud is divided into three kinds of services.

• Software as a Service (SaaS)

• Platform as a Service (PaaS)

• Infrastructure as a Service (IaaS)

SaaS is the commonly known part of the cloud, where a thin-client model is used; e.g. Google Apps, Facebook, Twitter and Dropbox fall within the definition of SaaS. PaaS is a lower level of computing, providing abstractions from the servers and delivering an environment often used for development. PaaS providers include, for example, Google App Engine and Red Hat’s OpenShift. IaaS is the building block for cloud services. IaaS providers make it possible for others to deploy their applications to run on the cloud. They often offer the services using virtualization software to provide the security and integrity measures needed [6].

IaaS is estimated to have grown the most in proportion to the global public cloud consumption, from 31.9% to 38.4% between 2015 and 2016 [9]. This increasing interest makes the demands troublesome. Leaving aside the fundamental conventional expenditures, such as physical infrastructure and maintenance, the remaining cost largely comes down to energy consumption.


The combination of IaaS and virtualization makes scaling possible. The elasticity is referred to as horizontal or vertical. Horizontal elasticity is achieved by increasing and decreasing the number of virtualized instances, whereas vertical elasticity modifies the amount of resources allocated to a virtualized instance [25, p. 1].

Current cloud services for IaaS base their pricing on metrics such as time or number of containers¹ [26][5]. This can in some cases seem arbitrary, since the actual cost for the provider consists of the mentioned fundamental cost and energy consumption. This is controversial, since a client with a simple application with little or no demand for computation will have the same pricing basis as a client with an application doing heavy computing.

3 Related work

To look at how clouds and energy consumption are managed, this section reviews recent advancements within the area of energy versus quality of service (QoS) and performance, from the point of view of the whole data centre down to virtualization and applications.

3.1 Cluster resource management related to energy consumption

[14, p.1-3] mentions the complexity of energy consumption in a data center. The author points out several different courses of action to improve power efficiency. It is mentioned that servers achieve the best power efficiency when the level of utilization is high and that CPUs are not linear in power consumption. This makes the peak efficiency of CPUs binary, since they are only efficient at high utilization or when powered off.

The author continues by demonstrating three power-efficient ways to operate the data center servers. The first is server consolidation, where the number of servers used is matched to the applications so that the servers run at a high utilization level. The second is server throttling using, for example, DVFS² or CPU pinning; this is however mentioned as a complex and non-optimal approach. The last approach is power budgeting, where the premise is to control the power usage of the data center to minimize capital and operational expenditures throughout the lifetime of the data center [14, p.11-14].

In [30, p.205-214] the authors consider the relationship between performance, power and different configurations in cloud computing with regard to horizontal and vertical elasticity, the number of cores and CPU frequency. Their experiment consists of a video-encoding scenario. They conclude that it is feasible, with a presented feedback controller, to determine an optimal configuration that minimizes energy consumption and meets the performance goal, with a 34% energy saving in comparison to the constituent approaches mentioned.

¹ A free limited usage in time, computing or storage space is often included.

² Dynamic voltage and frequency scaling


3.2 Containers versus VMs, energy, and performance

[19] evaluates the performance of Docker with the benchmark tools Bonnie++, psutil and performancetool, and the authors claim that running in Docker is comparable to running on an OS without virtualization.

In [24] an extensive empirical experiment was made executing the applications Wordpress, Redis, and PostgreSQL. RAPL and WATTS UP? PRO were used to collect energy usage; the WATTS UP? PRO collects metrics between the power supply and the physical computer. They conclude that Docker consumes more power than running the application directly on the computer. They measured that dockerd³ consumes 2 Watt when idle.

In [18] an empirical comparison is made between virtual machines (KVM[3] and Xen[10]) and containers (LXC[7], Docker[13]) using an external power measurement device (Raritan 22). They conclude that both hypervisors and containers create an overhead in energy consumption; it is also pointed out that the hypervisors in most cases consume more power for network performance. This is also shown in [23], where it is mentioned that a Virtual Machine can use up to 40% more energy for network communication and take up to 5 times the number of cycles to deliver a packet compared to bare-metal, whilst the container virtualization compared was close to identical to bare-metal.

3.3 Pricing the Cloud

[32] makes a comparison between cloud providers (Amazon Web Services, Google App Engine, and Windows Azure) and their pricing to understand the implications of the paradigm of distributed systems and economics. The authors run tests with different benchmarks in order to choose the best cloud provider for each kind of computing in their experiment. The benchmarks represent applications with different main objectives: I/O, high-performance computing, large-scale data processing and storage archival. Their conclusion is not inconclusive, but while running the experiment some bugs occurred, resulting in some tests taking more time than initially planned. This points out troublesome facts concerning the relation to the underlying infrastructure when pricing by time.

[32] mentions that both the providers and users will look at the pricing to optimize the financial value of the service; this is a direct indication that pricing has a large value towards creating an energy-efficient system.

4 Method

To calculate the energy consumption of a containerized application by only using metrics from the container engine, a test bench needs to be constructed. To produce a feasible test bench, numerous empirical experiments must be performed and analyzed. Section 4.1 defines a general approach, and Section 4.2 describes a subset of the general approach used to create a proof of concept. The results of Section 4.2 are presented in Section 5.

³ The Docker daemon


4.1 General method approach

An external power measurement tool should be included to be able to determine the overall energy consumption of the computer. The number and kind of sensors included in the test bench are limited by the metric output of the container engine; there is no point in collecting metrics that cannot be used in calculations. An internal power measurement sensor is needed to collect power measurements for the CPU, memory and other hardware devices. The work executed by each respective device is referred to as a variable below.

Initially, the experiment should measure the computer without the container software platform. The experiment should also include a measurement with the container software platform running, and a measurement with the test bench application running. This gives the minimum energy consumption used by the computer when executing the test bench. These three values make it possible to calculate the difference between running only the operating system, running the operating system with the container software platform, and finally running the operating system with the containerized test bench.

The next step should test single variables, one at a time, without performing work on the other variables from the tool's point of view. The workload of the variable should be increased in reliable increments covering the range from idle up to and including full performance. Each test must be repeated numerous times for validation.

At this stage, it should be possible to make a regression analysis to create a function that gives energy consumption with regard to the device performance. A standard deviation should also be attainable.

The following benchmark tests should increment the number of variables tested together and cover every permutation of the experiment. This should be iterated until all variables are tested, to create a complete graph of all the experiments. From each iteration, a multivariable regression analysis is calculated.
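The iteration over groups of variables can be sketched as follows. This is a minimal illustration, not the test bench itself: it reads "every permutation" as every unordered combination of variables (an interpretation, since the order in which devices are loaded is assumed not to matter), and the variable names are hypothetical placeholders.

```python
from itertools import combinations

def experiment_plan(variables):
    """Return every non-empty combination of variables, smallest groups first."""
    plan = []
    for size in range(1, len(variables) + 1):
        # All groups of `size` variables that are tested together.
        plan.extend(combinations(variables, size))
    return plan

# Hypothetical variable names; the text only names CPU, memory and other devices.
plan = experiment_plan(["cpu", "memory", "disk", "network"])
for group in plan:
    print(" + ".join(group))
```

With four variables the plan holds 15 groups, starting with the four single-variable tests and ending with all variables tested together.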

4.2 Proof of concept

The following section describes the proof of concept and how the underlying parts are set up. This report measures the metrics given by the container software platform, in this case Docker, to be able to calculate a regression analysis. It only looks at how CPU utilization relates to energy consumption.

Simulation setup

The simulation tool (hereafter also called the test bench) was run within a container on a virtual machine on a computer, as seen in Figure 3.


Figure 3: The simulation setup consists of a computer with a containerized Simulator and a program created to supervise the handling of the simulation, called SimulationSupervisor.

Computer

The operating system of the computers is Linux Debian 3.16.43-2 x86_64, running on an Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz processor with 4 cores and 8 threads, and 32GB RAM.

Virtual machine

The virtual machine runs on the hypervisor VirtualBox [21] and emulates Linux 4.8.0-52-generic x86_64 with one core and 8GB RAM.

Container environment

The container engine is Docker 17.03.1-ce with default settings, except for activation of the REST-API[12].

Collection tools

For the collection of metrics, the open source framework Snap telemetry, or snaptel for short, was used. The framework is divided into three parts: collect, process and publish. Each part has plugins to collect, process or publish to or from different software. The collection part gathers metrics from different software. The process part takes all the collected data and transforms or modifies it so that it can be given to any publish tool. The publish tool takes the processed data and reports it to the defined tool. The framework can be configured for how often to iterate this process, known as a task. [29]

In this instance, the task was set up with the collection plugin docker v.7[16]. The Docker plugin collects the information that can be gathered from the Docker engine's API. The process part uses the default setting, in which the collected data is formatted as JSON fields. The publish part used the plugin file v.2[27]. The file plugin saves the gathered data to a file on a storage device. The task was set to save with a one-second interval. The task is created and stopped via the Snaptel REST-API. The framework version used was 1.2.0.

For the collection of energy consumption, the tool Intel® Running Average Power Limit, RAPL for short, was used. RAPL is not an analog power meter but uses a software model; as seen in Section 2 the tool has been proven to give realistic values.

Figure 4: The layout of the monitoring controls available. The green, blue and purple parts are the metrics used. For some client/server processors, metrics for graphics are also available. Figure inspired by [8].

As seen in Figure 4, RAPL can provide power metrics for several parts, including memory (green), cores (purple) and package (blue). The metrics are collected from /sys/class/powercap/intel-rapl:TYPE/energy_uj, where TYPE in the path stands for each part of the RAPL collection mentioned. The collected output from each part is in micro-joule. These values were saved to a file at an interval of one second during the collection period of the simulation, using a bash script.
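Since the counter in energy_uj is cumulative, average power follows from the difference between two readings divided by the elapsed time. The sketch below illustrates that conversion in Python; it is not the bash script used in the thesis. The function names are hypothetical, and the wraparound handling assumes a counter range such as the one the powercap interface exposes as max_energy_range_uj; whether the original collection handled wraparound is not stated.

```python
def read_energy_uj(path):
    """Read one cumulative energy counter value (micro-joule) from a powercap file."""
    with open(path) as counter_file:
        return int(counter_file.read().strip())

def average_watt(start_uj, end_uj, seconds, max_range_uj=None):
    """Average power in Watt between two cumulative readings taken `seconds` apart."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta_uj = end_uj - start_uj
    if delta_uj < 0:
        # Assumption: the counter wrapped around once between the two readings.
        if max_range_uj is None:
            raise ValueError("counter wrapped; max_range_uj is required")
        delta_uj += max_range_uj
    # 1 Joule = 1 000 000 micro-joule, and Watt = Joule / second.
    return delta_uj / 1_000_000 / seconds

# Two readings one second apart that differ by 10 000 000 uJ give 10 W.
print(average_watt(1_000_000, 11_000_000, 1.0))
```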


Simulator

The simulator is a Java 8 application following Algorithm 1. The lower the first argument, the more often the operation will be done; when the first argument is 0, it will repeat the calculation iteration without any idling.

Algorithm 1 Simulator algorithm

procedure SIMULATOR(String[] arguments)
    sleepTime ← 1000*60*100
    if arguments.length < 2 then
        Sleep(sleepTime)
        terminate
    sleepTime ← arguments[0]
    calculations ← arguments[1]
    while true do
        calc ← 0
        while calc < calculations do
            calc ← calc + 1
        Sleep(sleepTime)

As seen in Algorithm 1, the simulation will go on forever if two arguments are given to the Simulator. If two arguments are given, the first argument sets the sleep time in milliseconds between each iteration, and the second argument sets how many calculations are executed between each sleep. If fewer than two arguments are given to the Simulator, it will sleep for 100 minutes before terminating. The variables sleepTime and calc are of data type long.
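For readers who want to try the duty-cycle behaviour, the description above can be transcribed roughly as follows. This is a Python sketch, not the Java 8 application used in the thesis. The parameters max_rounds and sleep are hypothetical additions that make the otherwise endless loop bounded and observable; they are not part of Algorithm 1.

```python
import time

def simulate(arguments, max_rounds=None, sleep=time.sleep):
    """Busy-count, then sleep, repeatedly; returns the number of completed rounds."""
    if len(arguments) < 2:
        # Fewer than two arguments: idle for 100 minutes, then terminate.
        sleep(100 * 60)
        return 0
    sleep_ms = int(arguments[0])      # first argument: sleep time in milliseconds
    calculations = int(arguments[1])  # second argument: increments per round
    rounds = 0
    # max_rounds=None reproduces the endless loop of the original algorithm.
    while max_rounds is None or rounds < max_rounds:
        calc = 0
        while calc < calculations:
            calc += 1
        sleep(sleep_ms / 1000.0)      # time.sleep expects seconds
        rounds += 1
    return rounds

# Three short rounds of 1000 increments with 10 ms of idling in between.
print(simulate(["10", "1000"], max_rounds=3))
```

A lower sleep time leaves less idle time between the calculation bursts, which is what raises the CPU usage across the rows of the experiment.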

SimulationSupervisor

The SimulationSupervisor is most easily understood by studying the flow of the following pseudo-algorithm:

1. Read and parse configFile given by program argument.

2. Check if API for Snaptel and Docker is active.

3. Start Simulation via Docker API with given argument in configFile.

4. Start collection from Snaptel and RAPL.

5. Wait for 2 minutes and 20 seconds

6. Stop Snaptel and RAPL collections.

7. Stop the Simulation

8. Parse the files of collected metrics.

9. Make MATLAB calculation to get relevant data from the metrics.


To avoid including metric data affected by the start and stop of the collection tools, the parsing removes ten seconds from the beginning and end of each simulation, resulting in a total run time of two minutes.
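The trimming step can be sketched as below, assuming one sample per second as described for both collection tools. The function name and parameters are hypothetical; the actual parsing in the SimulationSupervisor is not shown in the text.

```python
def trim_samples(samples, trim_seconds=10, interval_seconds=1):
    """Drop the samples from the first and last `trim_seconds` of a run."""
    drop = trim_seconds // interval_seconds
    if len(samples) <= 2 * drop:
        raise ValueError("run is too short to trim")
    return samples[drop:len(samples) - drop]

# A run of 2 minutes and 20 seconds sampled once per second gives 140 samples.
run = list(range(140))
kept = trim_samples(run)
print(len(kept))  # 120 samples, i.e. two minutes
```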

Experiments and results

Each experiment is done on three different physical computers with identical hardware and software settings, as described in Section 4.2. The input variables for the Simulator, as described in Section 4.2, can be seen in Table 1. The row with a sleep time of 6 000 000 ms makes the simulator sleep throughout the whole simulation; keep in mind that the collection used in Section 5 starts ten seconds after the simulator has started, so the initial calculation iteration is disregarded.

Table 1 Each row is a single test that will be run three times.

Sleep time (ms)    Calculations
0                  1 000 000 000
100                1 000 000 000
200                1 000 000 000
300                1 000 000 000
400                1 000 000 000
500                1 000 000 000
600                1 000 000 000
700                1 000 000 000
800                1 000 000 000
900                1 000 000 000
1 000              1 000 000 000
6 000 000          1 000 000 000

From the data collected, a linear regression is used to calculate energy consumption with regard to CPU usage for the simulation result. The CPU usage is collected from Docker, and pkg0+dram is collected from RAPL. The regression gives the equation

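$$\hat{y} = a + bx \tag{1}$$

where $\hat{y}$ is the predicted value of y from a, b and any given value x, a is the estimated intercept and b is the estimated slope. With $\bar{x}$ and $\bar{y}$ denoting the means of the collected values,

$$b = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} \tag{2}$$

$$a = \bar{y} - b\bar{x} \tag{3}$$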
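Equations 1 to 3 can be sketched in a few lines of Python, with the numerator of the slope taken as the sum of the products of the deviations from the means. This is an illustration only; the thesis states that the calculations were made in MATLAB, and the function name and sample data here are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least two points")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Denominator of Equation 2: sum of squared deviations of x.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    # Numerator of Equation 2: sum of products of the deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx            # Equation 2
    a = y_mean - b * x_mean  # Equation 3
    return a, b

# Hypothetical data lying exactly on y = 1 + 2x.
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)
```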

The Pearson correlation coefficient (PCC), a measure of the linear correlation between the two sets in the linear equation, as seen in Equation 4, is calculated for each set.

$$\mathrm{PCC} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}} \tag{4}$$
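The Pearson correlation coefficient in Equation 4 can be computed in the same style. The sketch below is illustrative and uses hypothetical data; a value close to 1 indicates a nearly perfect increasing linear relation, and a value close to -1 a decreasing one.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least two points")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a perfect increasing and a perfect decreasing relation.
print(pearson([1, 2, 3], [2, 4, 6]))
print(pearson([1, 2, 3], [6, 4, 2]))
```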


5 Result

This section presents data from the proof of concept benchmark from Section 4.2. The raw data is shown in the tables of Appendix A and B. The figures in this section are also displayed in Appendix C at full page width. Each data point in this section relates to one simulation run on one specific computer. In the figures, it is irrelevant which simulation run a point belongs to, since the main objective is to match CPU usage and energy consumption.

(a) Simulation results of computer B. (b) Simulation results of computer C. (c)

Figure 5: The usage of energy and memory measured against CPU usage for each computer. Note that the memory values are not shown in proportion within the subfigures, so as not to affect the graphs.

The subfigures of Figure 6 show essentially the same data: since each collection covers 120 seconds, the values in Joule and Watt differ only by a constant factor (Watt = Joule / 120).


Figure 6: The sum of the RAPL collection from pkg0 and dram in relation to CPU usage, shown in subfigures (a) and (b).

Linear regression is calculated on the data from Figure 6, as shown below in Tables 2 and 3. The calculation for all the data combined is also presented.

Table 2 Linear regression results, from the CPU usage data from Docker and pkg0+dram in Joule from RAPL, for each of the computers. The row ’All’ combines the data from all computers.

Computer    Slope           Intercept    PCC
B           1.713582E-08    1201.385     0.9967
C           1.702123E-08    1129.752     0.9969
Q           1.713063E-08    1161.934     0.9957
All         1.708180E-08    1164.981     0.9925


Table 3 Linear regression results, from the CPU usage data from Docker and pkg0+dram calculated as Watt from RAPL, for each of the computers. The row ’All’ combines the data from all computers.

Computer    Slope           Intercept    PCC
B           1.427985E-10    10.0115      0.9968
C           1.418435E-10    9.4146       0.9969
Q           1.427553E-10    9.6828       0.9957
All         1.423483E-10    9.7082       0.9925
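Because every run covers 120 seconds, the Joule fit in Table 2 and the Watt fit in Table 3 should differ only by that factor in both slope and intercept. The following sketch checks this with the values copied from the two tables; the function names and the tolerance are illustrative choices, and small differences in the last digit are attributed to rounding in the tables.

```python
from math import isclose

WINDOW_SECONDS = 120  # collection window per run, as stated in the text

# (slope, intercept) pairs copied from Table 2 (Joule) and Table 3 (Watt).
TABLE2_JOULE = {
    "B": (1.713582e-08, 1201.385),
    "C": (1.702123e-08, 1129.752),
    "Q": (1.713063e-08, 1161.934),
    "All": (1.708180e-08, 1164.981),
}
TABLE3_WATT = {
    "B": (1.427985e-10, 10.0115),
    "C": (1.418435e-10, 9.4146),
    "Q": (1.427553e-10, 9.6828),
    "All": (1.423483e-10, 9.7082),
}

def joule_fit_to_watt(slope_j, intercept_j, seconds=WINDOW_SECONDS):
    """Convert a fit of energy (Joule) into a fit of average power (Watt)."""
    return slope_j / seconds, intercept_j / seconds

def tables_agree(rel_tol=1e-5):
    """True if every Table 2 row divided by the window matches its Table 3 row."""
    for name, (slope_j, intercept_j) in TABLE2_JOULE.items():
        slope_w, intercept_w = joule_fit_to_watt(slope_j, intercept_j)
        expected_slope, expected_intercept = TABLE3_WATT[name]
        if not isclose(slope_w, expected_slope, rel_tol=rel_tol):
            return False
        if not isclose(intercept_w, expected_intercept, rel_tol=rel_tol):
            return False
    return True

print(tables_agree())
```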

6 Discussion

This section is divided into the discussion and conclusion of the results from the proof of concept, the related work, and a follow-up discussion of predicting energy consumption as a method of pricing.

6.1 Proof of concept

The results do not include a baseline for what the computers consumed without the simulation started. Estimating the energy consumption of the collection tool was more time consuming than the scope of the proof of concept allowed. This concerns the snaptel tool consuming more energy when the simulation was running, since more data was being collected; this is however not expected to have affected the results, since it was negligible.

The biggest source of uncertainty is that the simulations were not run in a controlled environment, since the computers used were in an open laboratory environment. Due to the constraints of the laboratory environment, the containerized environment had to run in a virtual machine. Given these circumstances, the results should be taken as an indication of the relation between CPU usage and energy consumption. It should also be noted that each test was only run once per computer, which limits the certainty of the results.

Some unrepresented and unscientific measurements were made on the computers of the watt usage when simulation and collection were not activated, and these ended up at 8.7-9.0 Watt. When using all cores at 100% for 1 minute, the consumption was around 93 Watt with a deviation of 2 Watt; the TDP⁴ of the processor is 84 Watt, making the numbers conceivable. Since the Virtual Machine was only using one core, it is probable that the usage of the simulation when using the full potential of the CPU ended up at 26-27 Watt.

The memory usage presented in the results is unexplained and presumed to come from the JVM or Docker. The usage is however really small and did not seem to affect the energy consumption, and was therefore disregarded.

The proof of concept outcome gives a feasible indication of predicting the energy consumption of the simulation data. When using each computer's linear regression equation on the data, the biggest difference was on Computer C with a 0.505 Watt deviation, while using the All-equation the biggest deviation was 0.813 Watt. However, predicting the data included in the simulation set is improper but will suffice in a proof of concept. The PCC (Pearson correlation coefficient) confirms the indication of feasibility, deviating less than 0.01 from 1.

⁴ Thermal Design Power
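As an illustration of how the All-equation would be used for prediction, the sketch below plugs the slope and intercept of the 'All' row of Table 3 into Equation 1. The input value 1.2e11 is a hypothetical example: it corresponds to one fully used core over the 120 second window only under the assumption that the CPU usage reported by Docker is cumulative CPU time in nanoseconds, which the text does not state explicitly.

```python
# Intercept (Watt) and slope (Watt per unit of CPU usage) from the 'All' row of Table 3.
INTERCEPT_W = 9.7082
SLOPE_W = 1.423483e-10

def predict_watt(cpu_usage):
    """Predicted average power (Watt) for a given total CPU usage, per Equation 1."""
    return INTERCEPT_W + SLOPE_W * cpu_usage

# Idle run: the prediction equals the intercept.
print(predict_watt(0))
# Hypothetical full-load run (assumes nanosecond units, see the note above).
print(round(predict_watt(1.2e11), 2))
```

Under that unit assumption the full-load prediction is about 26.8 Watt, which is in line with the 26-27 Watt estimated in the discussion for one fully used core.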

6.2 Related work

Several of the articles from Section 3 are of high scientific quality. They differ in what they present as remarkable in their results, even when the results are similar. There are no clear contradictions, but rather a consensus on the effect of containerization in comparison to bare-metal and virtual machines regarding energy consumption. The articles introduced in Section 3 cover an immense width within the area of energy consumption and distribution.

The technological advancement, especially in containerization and IaaS, provides several benefits on top of those already offered by hypervisors and virtual machines. Nevertheless, the security discussion on containerization is not without weight. The author believes containerization has much to offer the cloud service market, but it will not be relevant until cloud providers can offer a containerization environment without the need to wrap it in a virtual machine. This is from the aspect of energy consumption and pricing.

6.3 Predicting energy consumption, pricing, and future work

It is not unreasonable to approach the question at hand with a machine learning method that produces a function for predicting energy consumption based on metrics given by the container software environment. Again, since the proof of concept only addresses CPU usage, the tests must be extended to other devices, notably memory and network traffic.

For future work building on the proof of concept, a real application is more complex than the test bench used here, and memory usage and other devices such as network traffic or hard disk must be taken into account. The results and conclusions must be treated with caution when relating them to any typical application or other benchmarks. A more extensive test bench and repeated iterations of the experiments are needed before the results can be taken into consideration.

While there ain’t no such thing as a free lunch for the question at hand, it does propose an interesting suggestion for the pricing of cloud services. As mentioned in [32], a common complaint is unfairness in pricing. As in all kinds of service markets, the one offering the lowest price for the same quality is commonly the one getting the customers. The market for IaaS within containerization is potentially still young, and as new providers are introduced to the market, prices are bound to be pushed down. The effective cost of a data center is mainly tied to energy consumption, which interrelates billing, fair pricing, and climate. A direct association between energy consumption and pricing therefore deserves interest for cloud services within IaaS.


References

[1] Amazon Alexa. Alexa top 500 global sites. http://www.alexa.com/topsites. (Accessed on 05/10/2017).

[2] Amazon Alexa. Wikipedia.org traffic, demographics and competitors - alexa. http://www.alexa.com/siteinfo/wikipedia.org. (Accessed on 05/23/2017).

[3] ArchLinux. Kvm - archwiki. https://wiki.archlinux.org/index.php/KVM. (Accessed on 05/23/2017).

[4] The Kubernetes Authors. What is kubernetes? — kubernetes. https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/. (Accessed on 05/23/2017).

[5] Google Cloud. Google container engine pricing and quotas — container engine documentation — google cloud platform. https://cloud.google.com/container-engine/pricing. (Accessed on 05/21/2017).

[6] ComputeNext Colman E. Cloud computing basics - iaas, paas, saas comparison — computenext. https://www.computenext.com/blog/when-to-use-saas-paas-and-iaas/. (Accessed on 06/01/2017).

[7] Creative Commons. Linux containers. https://linuxcontainers.org/. (Accessed on 05/23/2017).

[8] Intel Dimitriov M. Intel® power governor — intel® software. https://software.intel.com/en-us/articles/intel-power-governor. (Accessed on 06/02/2017).

[9] Cartika Fougere R. Evolution of infrastructure-as-a-service (iaas) — cartika - cartika. https://www.cartika.com/blog/iaas/. (Accessed on 06/01/2017).

[10] Linux Foundation. The xen project, the powerful open source industry standard for virtualization. https://www.xenproject.org/. (Accessed on 05/23/2017).

[11] CIO From IDG. What are containers and why do you need them? — cio. http://www.cio.com/article/2924995/software/what-are-containers-and-why-do-you-need-them.html. (Accessed on 05/10/2017).

[12] Docker Inc. Docker engine api and sdks - docker documentation. https://docs.docker.com/engine/api/. (Accessed on 05/21/2017).

[13] Docker Inc. What is docker? https://www.docker.com/what-docker. (Accessed on 05/20/2017).

[14] Krzywda Jakub. Analysing, modelling and controlling power-performance tradeoffs in data center infrastructures. Umea Universitet, 2017.

[15] Strauss D. Linux Journal. Containers—not virtual machines—are the future cloud — linux journal. http://www.linuxjournal.com/content/containers%E2%80%94not-virtual-machines%E2%80%94are-future-cloud. (Accessed on 05/10/2017).

[16] Krolik M. Github - intelsdi-x/snap-plugin-collector-docker: Collects docker container runtime metrics. https://github.com/intelsdi-x/snap-plugin-collector-docker. (Accessed on 05/23/2017).

[17] Algreene B. Mashable. How big is the cloud? http://mashable.com/2012/10/04/how-big-is-the-cloud/#oyr1wu4mppqb, 10 2012. (Accessed on 05/10/2017).

[18] R. Morabito. Power consumption of virtualization technologies: An empirical investigation. In 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing (UCC), pages 522–527, Dec 2015.

[19] Preeth E N, F. J. P. Mulerickal, B. Paul, and Y. Sastri. Evaluation of docker containers based on hardware utilization. In 2015 International Conference on Control Communication Computing India (ICCC), pages 697–700, Nov 2015.

[20] Nginx. What is load balancing? how load balancers work. https://www.nginx.com/resources/glossary/load-balancing/. (Accessed on 05/10/2017).

[21] Oracle. Oracle vm virtualbox. https://www.virtualbox.org/. (Accessed on 05/22/2017).

[22] Kamp P-H. Introduction to varnish — varnish http cache. https://www.varnish-cache.org/intro/index.html#intro. (Accessed on 05/10/2017).

[23] R. Shea, H. Wang, and J. Liu. Power consumption of virtual machines with network transactions: Measurement and improvements. In IEEE INFOCOM 2014 - IEEE Conference on Computer Communications, pages 1051–1059, April 2014.

[24] Solinas C. Hindle A. Santos E. A., M. Carson. How does docker affect energy consumption? evaluating workloads in and out of docker containers. https://arxiv.org/abs/1705.01176, 5 2017. (Accessed on 05/21/2017).

[25] Mina Sedaghat, Francisco Hernandez-Rodriguez, and Erik Elmroth. A virtual machine re-packing approach to the horizontal vs. vertical elasticity trade-off for cloud autoscaling. In Proceedings of the 2013 ACM Cloud and Autonomic Computing Conference, CAC ’13, pages 6:1–6:10, New York, NY, USA, 2013. ACM.

[26] Amazon EC2 Container Service. Aws — amazon ec2 container service — pricing. https://aws.amazon.com/ecs/pricing/. (Accessed on 05/21/2017).

[27] Taylor T. Github - intelsdi-x/snap-plugin-publisher-file: Publishes snap metrics to a file. https://github.com/intelsdi-x/snap-plugin-publisher-file. (Accessed on 05/23/2017).

[28] Shapland R. TechTarget. Cloud containers – what they are and how they work. http://searchcloudsecurity.techtarget.com/feature/Cloud-containers-what-they-are-and-how-they-work. (Accessed on 05/10/2017).

[29] Snap telemetry Home. Snap - a powerful open telemetry framework. http://snap-telemetry.io/. (Accessed on 05/10/2017).

[30] S.K. Tesfatsion, E. Wadbro, and J. Tordsson. A combined frequency scaling and application elasticity approach for energy-efficient cloud computing. Sustainable Computing: Informatics and Systems, 4(4):205 – 214, 2014. Special Issue on Energy Aware Resource Management and Scheduling (EARMS).

[31] Red Hat Enterprise thildred. The history of containers – red hat enterprise linux blog. http://rhelblog.redhat.com/2015/08/28/the-history-of-containers/. (Accessed on 05/20/2017).

[32] Hongyi Wang, Qingfeng Jing, Rishan Chen, Bingsheng He, Zhengping Qian, and Lidong Zhou. Distributed systems meet economics: pricing in the cloud. In HotCloud’10.

[33] WikiMedia. Dashiki: Report card. https://analytics.wikimedia.org/dashboards/reportcard/#pageviews-july-2015-now/monthly-pageviews-2015-now. (Accessed on 05/10/2017).

[34] WikiMedia. Page views for wikipedia, both sites, normalized. https://stats.wikimedia.org/EN/TablesPageViewsMonthlyCombined.htm. (Accessed on 05/10/2017).

[35] Wikimedia. Wikimedia servers - meta. https://meta.wikimedia.org/wiki/Wikimedia_servers. (Accessed on 05/10/2017).

[36] WikiMedia. Wikimedia traffic analysis report - overview. https://stats.wikimedia.org/wikimedia/squids/SquidReportPageViewsPerCountryOverview.htm. (Accessed on 05/10/2017).

Appendix A Parsed raw data

Table 4: Raw data from results. Note that times are collected from the RAPL collection. Snaptel data is the closest in time, making a divergence of 0.5 seconds. The RAPL collection is in microjoules.

Start collection values

                                       Docker total usage      RAPL
Computer  Arg0     Time                CPU           Memory    pk0           dram          core
B         6000000  21:05:44.171361240  2457787341    12767232  113806675231  65341006225   92781456420
C         6000000  21:05:42.147775479  2258292803    12492800  145335872375  68216903015   250037423889
Q         6000000  21:06:18.334701850  2650571500    11272192  37452913269   192960464965  221754854431
B         1000     21:09:53.731203886  1389637469    12771328  117873376281  66308458312   95073806457
C         1000     21:09:53.010755167  2089899904    13869056  149170426940  69189778442   252131173156
Q         1000     21:10:27.014320801  784081846     12357632  41421615112   193925571166  223922890075
B         900      21:11:58.969948759  1174749938    12754944  120099563537  66794353088   96360499267
C         900      21:11:58.243865609  1914358023    13754368  151352600463  69675928588   253377289428
Q         900      21:12:33.172695361  1348059275    13819904  43612122619   194415425170  225148880310
B         800      21:14:03.891189071  1176419136    12337152  122098481079  67278599609   97479275878
C         800      21:14:02.076373301  910305711     12906496  153292549316  70151906616   254437212463
Q         800      21:14:38.736919556  1043783726    13201408  45564346984   194902901062  226205625122
B         700      21:16:10.181745644  1018516115    12763136  123868007080  67767703674   98403939636
C         700      21:16:07.544621487  967372068     13791232  154954799926  70641369018   255284387390
Q         700      21:16:45.326107611  1343185835    12374016  47294851440   195393645263  227088087036
B         600      21:18:15.164394305  1140523030    11350016  125477770141  68251237121   99216282287
C         600      21:18:11.441156737  601910729     11304960  156626468261  71121195129   256145243347
Q         600      21:18:50.610353398  861751739     12795904  48832148620   195879162475  227831355102
B         500      21:22:23.653989376  596999914     12771328  127718957214  69210765014   100086761108
C         500      21:22:21.939424619  737553675     12455936  158823277526  72088599060   256973655395
Q         500      21:22:58.722047524  670742094     11350016  51053879943   196839449218  228664179199
B         400      21:24:29.124905324  672020956     13242368  129172027221  69696062316   100780558227
C         400      21:24:25.587601795  616126612     12742656  160131647521  72566399169   257555916015
Q         400      21:25:03.784630855  593636655     12955648  52448671081   197324355895  229288623535
B         300      21:26:34.954638169  620217554     12763136  130557943908  70182336669   101418598632
C         300      21:26:30.256430526  610614079     13340672  161387073120  73048058410   258097229370
Q         300      21:27:08.898875139  593487513     12791808  53720687316   197808508789  229830393615
B         200      21:28:38.377314334  602719583     12935168  131858385498  70659179443   101988009826
C         200      21:28:35.373078765  601682264     12759040  162633274230  73531662048   258634709716
Q         200      21:29:12.509675806  349671402     11292672  54950030334   198287225341  230355459594
B         100      21:07:48.209150890  603056345     12558336  116580837280  65823255676   94501304870
C         100      21:07:47.188771265  615458804     11321344  148029230712  68703973815   251673922058
Q         100      21:08:23.649773749  736277815     12738560  40186492675   193447924926  223404263183
B         0        21:20:19.504897664  58806831      13008896  126975751708  68732355224   99934533630
C         0        21:20:16.327784550  57146939      12259328  158026046875  71603986694   256791928710
Q         0        21:20:54.972461192  57364742      12566528  50294706176   196361067321  228511350952

End collection values

                                       Docker total usage      RAPL
Computer  Arg0     Time                CPU           Memory    pk0           dram          core
B         6000000  21:07:44.554582698  116537291845  12783616  116510999572  65808592590   94459352600
C         6000000  21:07:42.533203833  116138535889  12513280  147941998718  68685509643   251621821228
Q         6000000  21:08:18.712443202  116654746383  11292672  40108932312   193428250915  223361496887
B         1000     21:11:53.103209968  84909003431   12791808  119994147338  66771113708   96298356506
C         1000     21:11:53.387093733  85439042422   13885440  151264231384  69656622558   253324938537
Q         1000     21:12:27.393975922  84196646974   12374016  43504055297   194392511901  225085549438
B         900      21:13:58.350560243  67326173617   12754944  121976452026  67256383789   97399616088
C         900      21:13:58.631734208  73842069442   13754368  153250989440  70141985717   254412775329
Q         900      21:14:33.545904386  66561361811   13840384  45479557678   194882207946  226159417114
B         800      21:16:03.276264584  55522767847   12337152  123753939636  67740473205   98339313476
C         800      21:16:01.941169980  55486923187   12906496  154861871337  70619228820   255230972961
Q         800      21:16:38.521606215  54654058440   13201408  47166505920   195366692687  227012677368
B         700      21:18:10.565914823  47737226632   12763136  125398406860  68232888000   99170765930
C         700      21:18:07.936946176  54394885292   13791232  156561557373  71107115112   256107265014
Q         700      21:18:45.700329464  47156411098   12374016  48753530761   195859569274  227788313476
B         600      21:20:15.551946055  41689732447   11350016  126930685913  68716583557   99913325317
C         600      21:20:11.828786546  40967035484   12066816  157968407043  71586164367   256763251159
Q         600      21:20:50.361980361  40698216729   12795904  50238648498   196342727233  228484128662
B         500      21:24:23.038049721  36546918464   12771328  129085248046  69671995544   100733803588
C         500      21:24:21.323769450  36524286644   12455936  160068347167  72549475219   257520439331
Q         500      21:24:58.096195189  35741148621   11350016  52369001708   197301687622  229247434936
B         400      21:26:29.511318048  33102280930   13242368  130478833740  70160728088   101376849670
C         400      21:26:25.978879619  32557350204   12742656  161326270812  73031053283   258064803894
Q         400      21:27:03.158927302  32342563692   12955648  53642847473   197785798950  229790312255
B         300      21:28:34.343668479  30079961348   12763136  131796916381  70643094970   101954777465
C         300      21:28:30.649043254  29909842379   13340672  162568649047  73512912719   258600438842
Q         300      21:29:08.277414465  29194624978   12791808  54887850402   198270226257  230322969787
B         200      21:30:38.774497151  27456874673   12935168  133072874328  71124007263   102507428466
C         200      21:30:35.771198083  27180918981   12759040  163714957946  73996054687   259068794067
Q         200      21:31:12.893871501  26825542385   11325440  56098612731   198752547302  230817764831
B         100      21:09:48.602405532  25326574420   12558336  117788741638  66288016174   95025709167
C         100      21:09:47.587831988  25115723783   11321344  149085747985  69168332458   252083972839
Q         100      21:10:23.032135848  24890265705   12738560  41357956970   193909628601  223887850219
B         0        21:22:19.900463689  146882901     13008896  127661017272  69195724487   100054920837
C         0        21:22:16.723080014  141411107     12259328  158753841430  72067927490   256937554565
Q         0        21:22:54.348316686  141069151     12566528  50988405578   196821618408  228629650756

Appendix B CPU usage, Joule, and Watt

Table 5: Data from Table 4, changed to joule and calculated Watt

Computer  Arg0     CPU Usage     Joule Pkg0   Watt (Joule pkg0/120)
B         0        88076070      1148.634827  9.571957
B         100      24723518075   1672.664856  13.938874
B         200      26854155090   1679.316650  13.994305
B         300      29459743794   1699.730774  14.164423
B         400      32430259974   1771.472291  14.762269
B         500      35949918550   1827.521362  15.229345
B         600      40549209417   1918.262208  15.985518
B         700      46718710517   1995.584106  16.629868
B         800      54346348711   2117.332153  17.644435
B         900      66151423679   2338.919190  19.490993
B         1000     83519365962   2583.426453  21.528554
B         6000000  114079504504  3171.910706  26.432589
C         0        84264168      1191.735351  9.931128
C         100      24500264979   1520.875916  12.673966
C         200      26579236717   1546.076355  12.883970
C         300      29299228300   1646.430236  13.720252
C         400      31941223592   1659.277405  13.827312
C         500      35786732969   1705.945800  14.216215
C         600      40365124755   1806.908020  15.057567
C         700      53427513224   2072.503541  17.270863
C         800      54576617476   2036.644225  16.972035
C         900      71927711419   2364.446106  19.703718
C         1000     83349142518   2560.648560  21.338738
C         6000000  113880243086  3074.732971  25.622775
Q         0        83704409      1154.250489  9.618754
Q         100      24153987890   1633.167970  13.609733
Q         200      26475870983   1613.904358  13.449203
Q         300      28601137465   1628.880554  13.574005
Q         400      31748927037   1655.619447  13.796829
Q         500      35070406527   1777.360169  14.811335
Q         600      39836464990   1870.064636  15.583872
Q         700      45813225263   1924.603332  16.038361
Q         800      53610274714   2065.950561  17.216255
Q         900      65213302536   2334.217835  19.451815
Q         1000     83412565128   2549.380920  21.244841
Q         6000000  114004174883  3123.804993  26.031708
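The conversion from the raw RAPL counters to the Joule and Watt columns can be sketched as follows. This is an illustrative reconstruction rather than the thesis's own script: the helper names are hypothetical, and the observation that the Joule column matches the sum of the pk0 and dram deltas is inferred from the numbers, using the computer B run that starts at 21:20:19 in Table 4 and the B / 0 row of Table 5.

```python
MICROJOULES_PER_JOULE = 1_000_000
WINDOW_SECONDS = 120  # divisor used in the Watt column of Table 5


def energy_joules(start_uj, end_uj):
    """Energy used between two RAPL counter readings, in joules."""
    return (end_uj - start_uj) / MICROJOULES_PER_JOULE


def average_watts(joules, seconds=WINDOW_SECONDS):
    """Average power over the collection window."""
    return joules / seconds


# RAPL readings in microjoules for the computer B run starting at 21:20:19.
pk0_start, pk0_end = 126975751708, 127661017272
dram_start, dram_end = 68732355224, 69195724487

# Assumption: the Joule column is the pk0 delta plus the dram delta.
joules = energy_joules(pk0_start, pk0_end) + energy_joules(dram_start, dram_end)
watts = average_watts(joules)

print(f"{joules:.6f} J over {WINDOW_SECONDS} s = {watts:.6f} W")
```

Dividing by a fixed 120 seconds follows the column header of Table 5; the start and end timestamps in Table 4 differ from 120 seconds by a fraction of a second, so the Watt values are approximate averages.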


Appendix C Result figures, full page width size

Figure 7: Graph of results computer B, CPU usage as x-axis


Figure 8: Graph of results computer C, CPU usage as x-axis

Figure 9: Graph of results computer Q, CPU usage as x-axis


Figure 10: Graph of results of pkg0+dram in watt, CPU usage as x-axis

Figure 11: Graph of results of pkg0+dram in Joule, CPU usage as x-axis
