Evaluation of energy consumption in virtualization environments
Proof of concept using containers
Jonathan Westin
VT 2017
Bachelor’s thesis, 15 ECTS
Supervisor: P-O Ostberg
Examiner: Pedher Johansson
Bachelor Of Science Programme in Computing Science, 180 ECTS
Abstract
The demand for cloud services offering virtualization increases with a continual interest in different types of applications. Regardless of the resource demand of the application, some suppliers bill by time of usage, which makes the pricing unfair for clients.
In this thesis, the characteristics and properties of virtualization make room for another basis of payment. Since power consumption is a comprehensible measurement for both parties, the thesis investigates whether there are useful ways of measuring the energy consumption of an application.
The thesis creates a proof of concept for predicting energy consumption from the CPU usage of a containerized application. With an empirical simulation and linear regression, the results indicate that it is feasible to estimate the energy consumption from the simulation data with a deviation of at most 0.813 Watt. Even though the proof of concept is a harsh simplification of the problem of predicting the energy consumption of an application, the thesis addresses the issue of energy consumption within data centers.
Contents
1 Introduction
1.1 Problem statement
2 Background
2.1 Clouds and Clusters
2.2 Wikimedia, a distributed example
2.3 Virtualization
2.4 Cloud services, virtualization, and energy consumption
3 Related work
3.1 Cluster resource management related to energy consumption
3.2 Containers versus VMs, energy, and performance
3.3 Pricing the Cloud
4 Method
4.1 General method approach
4.2 Proof of concept
5 Result
6 Discussion
6.1 Proof of concept
6.2 Related work
6.3 Predicting energy consumption, pricing, and future work
References
A Parsed raw data
B CPU usage, Joule, and Watt
C Result figures, full page width size
1 Introduction
Cloud services have become one of the fastest growing segments of the service market, yet renting infrastructure so that customers can run an application without their own data center is not without complications. Customers and providers share a common goal: the best performance for the lowest financial cost. From the provider's perspective, profit comes from balancing the computers in the cluster to minimize idle capacity, while from the customer's perspective it means buying the performance their application needs for the lowest cost. In other words, customers only want to pay for what they use, and providers want to rent out the cluster as effectively as possible. This leads to one possible outcome: customers sharing computers in the cloud in an effective manner.
Consequently, it is essential for both parties to be able to monitor telemetry metrics, as the definition of best performance is not the same for both. Providers need to be able to manage the cluster efficiently, and customers must be able to inspect their application's performance. Otherwise, customers cannot make the strategic and financial adjustments their applications require.
Another obstacle arises from sharing computers. Customers do not want other customers to have access to their application, while the provider wants to maintain the profit of having customers share as much of the available assets as possible. This introduces the need to isolate customers from each other, for integrity and security reasons, when they share, for instance, the same physical computer.
Virtualization fortunately solves the problem of isolation within a computer. It provides the telemetry, security, integrity and isolation mentioned above while sharing a computer. The norm is currently Virtual Machines, which run an operating system within the operating system running on the computer. Another method of virtualization is containerization. In recent years the latter approach has gained widespread popularity and has several advantages over Virtual Machines. Although the use of containerization is new, the theory is not. The two kinds of virtualization are explained further in Section 2.3.
The common way for cloud providers to set pricing is by allocating pure hardware. This makes it nearly impossible to close the gap between what is economically beneficial for each party: customers want to be able to extend and shrink their usage and pay only for the performance used, while providers want to use the cluster in a way that is beneficial both in performance and financially.
The general costs for a cloud provider are those essential to any data center: hardware, location, and internet access. These are basic and predictable costs; the price of energy, on the other hand, is more uncertain. Energy use is directly related to the number of computers, or nodes, running within the data center. To regulate it, turning off idle nodes is a familiar idea; this would, however, ignore the problem of having acquired upscale hardware that goes unused.
In an ideal world, customers would not pay for anything not used and cloud providers would use the full extent of their cluster. Hence, for a non-profit cloud provider, the pricing would be calculated to cover the anticipated cost plus the energy consumption of each customer.
To estimate the energy consumption of an application (running on a node shared with other applications in a virtualized environment), it must be possible to predict energy consumption by looking exclusively at the metrics the virtualization software provides for the application. As described in Section 1.1, this thesis examines this problem.
1.1 Problem statement
This thesis creates a proof of concept that uses only telemetry metrics from a container software platform to estimate the energy consumption of an application. From the gathered data, a linear regression analysis is made to estimate energy consumption from the total CPU usage of the application. The result of the proof of concept is then used to discuss the possibility of pricing based on energy consumption.
2 Background
The idea behind distributed systems is simple: building a single computer with the performance to handle a large workload without struggle is not budget-wise. Dividing the computation among a larger set of modest computers is therefore more economical. Other aspects also play an integral role in the demand for distribution; for example, network communication is often limited by the local ISP. Distribution can increase performance for network-heavy applications (e.g., the World Wide Web). Distribution does not come without complications; it introduces other software engineering problems, including the need for transparency and scalability when creating an application.
This section introduces the current state of cloud services and explains virtualization and distribution with a real example. The section also covers current research on energy consumption, from the point of view of the whole data center down to the energy consumption of an application.
2.1 Clouds and Clusters
Cloud services such as cloud computing and cloud storage are a growing industry; the estimated size of the clouds in 2012 exceeded 1 exabyte [17], and the size keeps increasing. The advantages of clouds are that the user does not need a physical computer, and that the maintenance of updating and upgrading both the underlying software and hardware is removed from the user.
2.2 Wikimedia, a distributed example
The Wikimedia Foundation is mostly recognized for the website Wikipedia, which is recorded as one of the most viewed websites [1] on the Internet. Wikimedia counts 15 billion web page requests per month [36][33]. The English Wikipedia, which is responsible for half of the requests [34], receives on average 5 443 961 web requests per hour (about 1512.2 per second) with a mean response time of 0.958 seconds [2].
The incoming traffic cannot reasonably be handled by one single server, nor is it convenient for all network traffic to go to the same geographical location when the users are worldwide; distribution is key for survival in such cases. Distribution is needed not only across geographical locations but also within the same server and cluster. A typical setup for a high-traffic web application is a load balancer [20], HTTP caching servers [22], web application servers (e.g. Apache/Nginx) and a database; a simple case of local distribution is presented in Figure 1. This is a minimal setup for a high-traffic website. Wikimedia consists of over 350 servers to manage the incoming load of requests; Figure 1 only shows some parts, and Wikimedia also includes servers for caching, log collection and so on [35].
Figure 1: Simple prototype of a small distribution of a web application.
2.3 Virtualization
At present, the general practice of virtualization in the cloud is to give each client a separate virtual machine. A virtual machine makes it possible to run another operating system on top of the host operating system. This is beneficial when isolation is needed, and it brings other benefits: recovery and backup are straightforward since the state of the machine can be saved in a snapshot, which also makes it possible to move the virtual machine to a new node within a cluster. This gives both the security and integrity desired by the client. However, a virtual machine carries a lot of overhead in both disk space and memory.
Another kind of virtualization is containers, also referred to as containerization, which makes it possible to run software without depending on the underlying hardware. The theory behind containerization is old, but it has only recently become popular, because later innovations and development made it practical. Containers make it possible to run applications on different operating systems and disregard operating system dependencies. Containers have the same advantages as virtual machines: isolation and integrity from the server. Another advantage is scalability; a containerized application can easily be replicated or distributed automatically and near-instantaneously. Should one node of the cluster become unavailable, the containers can quickly and easily be moved to another node [31, 11].
There exist different container software platforms (engines), such as Linux Containers [7], Docker [13] and Kubernetes [4]. They make it possible for users to initialize their application in a container. All are to a large degree open platforms and can be used without charge.
Figure 2: Comparison between containerization and the hypervisor of a virtual machine, showing the overhead created by virtual machines.
While virtual machines are still the traditional way of virtualization, running applications in virtual machines makes the application dependent on the virtual OS. Containers, on the other hand, remove this dependency obstacle. Likewise, as seen in Figure 2, a container software platform (container engine) removes the overhead generated by the virtual machine for each client. This makes cloud containerization less storage demanding than virtual machines [28, 15].
2.4 Cloud services, virtualization, and energy consumption
The cloud is divided into three kinds of services.
• Software as a Service (SaaS)
• Platform as a Service (PaaS)
• Infrastructure as a Service (IaaS)
SaaS is the commonly known part of the cloud, where a thin-client model is used; e.g. Google Apps, Facebook, Twitter and Dropbox fall within the definition of SaaS. PaaS is a lower level of computing, providing abstraction from the servers and delivering an environment often used for development. PaaS providers include, for example, Google App Engine and Red Hat’s OpenShift. IaaS provides the building blocks for cloud services. IaaS providers make it possible for others to deploy their applications to run in the cloud. They often offer the services using virtualization software to provide the security and integrity measures needed [6].
IaaS is estimated to have grown the most in proportion to global public cloud consumption, from 31.9% to 38.4% between 2015 and 2016 [9]. This increasing interest makes the demands troublesome. Setting aside the fundamental conventional expenditures, such as physical premises, infrastructure and maintenance, the cost can be summed up as energy consumption.
The combination of IaaS and virtualization makes scaling possible. The elasticity is described as horizontal or vertical. Horizontal elasticity is achieved by increasing or decreasing the number of virtualized instances, whereas vertical elasticity modifies the amount of resources allocated to an instance [25, p. 1].
Current IaaS cloud services base pricing on metrics such as time or number of containers¹ [26][5]. This can in some cases seem arbitrary, since the actual cost for the provider consists of the fundamental costs mentioned above and energy consumption. It is contentious because a client with a simple application with little or no demand for computation has the same pricing basis as an application doing heavy computing.
3 Related work
To look at how clouds and energy consumption are managed, this section reviews recent advances in the area of energy versus quality of service (QoS) and performance, from the point of view of the whole data center down to virtualization and applications.
3.1 Cluster resource management related to energy consumption
[14, p.1-3] mentions the complexity of energy consumption in a data center. The author points out several courses of action to improve power efficiency. It is mentioned that servers achieve the best power efficiency when utilization is high and that CPUs are not linear in power consumption. This makes peak efficiency for CPUs binary, since they are only efficient at high utilization or when powered off.
The author goes on to demonstrate three power-efficient ways to operate data center servers. The first is server consolidation, where the number of servers is matched to the applications so that the servers run at a high utilization level. The second is server throttling using, for example, DVFS² or CPU pinning; this is, however, described as a complex and suboptimal approach. The last is power budgeting, where the premise is to control the power usage of the data center to minimize capital and operational expenditures throughout the lifetime of the data center [14, p.11-14].
In [30, p.205-214] the authors consider the relationship between performance, power and different configurations in cloud computing with regard to horizontal and vertical elasticity, the number of cores and CPU frequency. Their experiment consists of a video-encoding scenario. They conclude that it is feasible, with the presented feedback controller, to determine an optimal configuration that minimizes energy consumption and meets the performance goal, with a 34% energy saving in comparison to the constituent approaches mentioned.
¹ A free limited usage in time, computing or storage space is often included.
² Dynamic voltage and frequency scaling.
3.2 Containers versus VMs, energy, and performance
[19] evaluates the performance of Docker using the benchmark tools Bonnie++, psutil and performancetool, and the authors claim that Docker can be compared to running on an OS without virtualization.
In [24] an extensive empirical experiment was made executing the applications WordPress, Redis, and PostgreSQL. RAPL and WATTS UP? PRO were used to collect energy usage; the WATTS UP? PRO collects metrics between the power supply and the physical computer. They conclude that Docker consumes more power than running the application directly on the computer. They measured that dockerd³ consumes 2 Watt when idle.
In [18] an empirical comparison is made between virtual machines (KVM [3] and Xen [10]) and containers (LXC [7], Docker [13]) using an external power measurement device (Raritan 22). They conclude that both hypervisors and containers create an overhead in energy consumption; it is also pointed out that the hypervisors in most cases consume more power in network performance tests. This is also shown in [23], where it is mentioned that a Virtual Machine can use up to 40% more energy for network communication and take up to 5 times the number of cycles to deliver a packet compared to bare metal, whereas the container virtualization compared was close to identical to bare metal.
3.3 Pricing the Cloud
[32] compares cloud providers (Amazon Web Services, Google App Engine, and Windows Azure) and their pricing to understand the implications of the paradigm of distributed systems and economics. The authors run tests with different benchmarks to choose the best cloud provider for the different kinds of computing in their experiment. The benchmarks represent applications with different main objectives: I/O, high-performance computing, large-scale data processing and storage archival. Their conclusion is not inconclusive, but while running the experiment some bugs occurred, resulting in some tests taking more time than initially planned. This points out troublesome facts concerning the relation to the underlying infrastructure when pricing by time.
[32] mentions that both providers and users will look at the pricing to optimize the financial value of the service; this is a direct indication that pricing has great value for creating an energy-efficient system.
4 Method
To calculate the energy consumption of a containerized application using only metrics from the container engine, a test bench needs to be constructed. Producing a feasible test bench requires numerous empirical experiments to be performed and analyzed. Section 4.1 defines a general approach and Section 4.2 describes a subset of the general approach used to create a proof of concept. The results of Section 4.2 are presented in Section 5.
³ The Docker daemon.
4.1 General method approach
An external power measurement tool should be included to be able to determine the computer's overall energy consumption. The number and kind of sensors included in the test bench are limited by the container engine's metric output; it is not useful to collect metrics that cannot be used in calculations. An internal power measurement sensor is needed to collect power measurements for CPU, memory and other hardware devices. The work being executed by each device is referred to as a variable below.
Initially, the computer should be measured without the container software platform. The experiment should also include a measurement with the container software platform, and a measurement with the test bench application running. This gives the minimum energy consumption used by the computer when executing the test bench. These three values make it possible to calculate the difference between running only the operating system, running the operating system with the container software platform, and finally running the operating system with the test bench containerized.
The next step should exercise single variables, one at a time, without performing work on other variables from the tool's point of view. The variable's workload should be increased in reliable increments covering the range from idle to full performance, inclusive. Each test must be repeated numerous times for validation.
At this stage, it should be possible to make a regression analysis to create a function that gives energy consumption as a function of device performance. A standard deviation should also be attainable.
The following benchmark tests should increase the number of variables tested together and cover every combination of the experiment. This should be iterated until all variables are tested, to create a complete graph of all the experiments. From each iteration, a multivariable regression analysis is calculated.
4.2 Proof of concept
The following section describes the proof of concept and how its underlying parts are set up. This report measures the metrics given by the container software platform, in this case Docker, to be able to calculate a regression analysis. It only looks at how CPU utilization relates to energy consumption.
Simulation setup
The simulation tool (hereafter also called the test bench) was run within a container on a virtual machine on a computer, as seen in Figure 3.
Figure 3: The simulation setup consists of a computer with a containerized Simulator and a program created to oversee the handling of the simulation, called SimulationSupervisor.
Computer
The operating system of the computers is Debian Linux 3.16.43-2 x86_64, running on an Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz processor with 4 cores and 8 threads, and 32GB RAM.
Virtual machine
The virtual machine runs on the hypervisor VirtualBox [21] and emulates Linux 4.8.0-52-generic x86_64 with one core and 8GB RAM.
Container environment
The container engine is Docker 17.03.1-ce with default settings, except that the REST API [12] is activated.
Collection tools
For the collection of metrics, the open source framework Snap telemetry, or snaptel for short, was used. The framework is divided into three parts: collect, process and publish. Each part has plugins to collect, process or publish to or from different software. The collection part gathers metrics from different software. The process part takes all the collected data and transforms or modifies it so it can be given to any publish tool. The publish tool takes the processed data and reports it to the defined tool. The framework can be configured with how often to iterate this process, known as a task [29].
In this instance, the task was set up with the collection plugin docker v.7 [16]. The Docker plugin collects the information that can be gathered from the Docker engine's API. The process part uses the default setting, in which the collected data is formatted as JSON fields. The publish part used the plugin file v.2 [27], which saves the gathered data to a file on a storage device. The task was set to save at a one-second interval. The task is created and stopped via the Snaptel REST API. The framework version used was 1.2.0.
For the collection of energy consumption, the tool Intel® Running Average Power Limit, RAPL for short, was used. RAPL is not an analog power meter but uses a software model; as seen in Section 2 the tool has been proven to give realistic values.
Figure 4: The layout of the monitoring controls available. The green, blue and purple parts are the metrics used. On some client/server processors, a metric for graphics only is also available. Figure inspired by [8].
As seen in Figure 4, RAPL can provide power metrics for several parts, including memory (green), cores (purple) and package (blue). The metrics are collected from /sys/class/powercap/intel-rapl:TYPE/energy_uj, where TYPE in the path identifies each part of the RAPL collection mentioned. The collected output from each part is in microjoules. These values were saved to a file at an interval of one second during the collection period of the simulation using a bash script.
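Because the energy counter is cumulative, one-second samples can be turned into power readings by differencing consecutive values. The following is a minimal Python sketch of that idea, not the bash script used in the thesis; the function names and the optional wrap-around handling (`max_range_uj`) are assumptions added for illustration, and the counter path must be supplied by the caller.

```python
from pathlib import Path


def read_energy_uj(counter_path):
    """Read one cumulative energy sample (microjoules) from an energy_uj file."""
    return int(Path(counter_path).read_text().strip())


def power_watts(samples_uj, interval_s=1.0, max_range_uj=None):
    """Convert consecutive cumulative samples into average power per interval.

    max_range_uj is an assumed, optional wrap-around limit of the counter;
    when given, a negative difference is treated as a single counter wrap.
    """
    watts = []
    for previous, current in zip(samples_uj, samples_uj[1:]):
        delta_uj = current - previous
        if delta_uj < 0 and max_range_uj is not None:
            delta_uj += max_range_uj
        # microjoule -> joule, then joule per second -> watt
        watts.append(delta_uj / 1_000_000 / interval_s)
    return watts
```

With one-second sampling, each difference in microjoules divided by 10^6 is directly the average power in Watt for that second.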
Simulator
The simulator is a Java 8 application following Algorithm 1. The lower the first argument, the more often the operation is performed; when the argument is 0, the calculation iterations run without any idling.
Algorithm 1 Simulator algorithm
procedure SIMULATOR(String[] arguments)
    sleepTime ← 1000*60*100
    if arguments.length < 2 then
        Sleep(sleepTime)
        return
    sleepTime ← arguments[0]
    calculations ← arguments[1]
    while true do
        calc ← 0
        while calc < calculations do
            calc ← calc + 1
        Sleep(sleepTime)
As seen in Algorithm 1, the simulation will go on forever if two arguments are given to the Simulator. In that case, the first argument sets the sleep time in milliseconds between each iteration, and the second argument sets how many calculations are executed between each sleep. If fewer than two arguments are given to the Simulator, it sleeps for 100 minutes before terminating. The variables sleepTime and calc are of data type long.
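The duty cycle described above can be sketched in Python as follows. This is an illustration of the described behaviour, not the Java 8 simulator itself; the `max_cycles` parameter and the injectable `sleep` function are hypothetical additions so that the sketch can terminate and be exercised without waiting.

```python
import time


def simulator(args, sleep=time.sleep, max_cycles=None):
    """Duty-cycle loop following the description of Algorithm 1.

    args[0] is the sleep time in milliseconds, args[1] the number of
    calculations per cycle. max_cycles and sleep are hypothetical
    additions for testing. Returns the number of completed cycles.
    """
    if len(args) < 2:
        sleep(1000 * 60 * 100 / 1000)  # 100 minutes in seconds, then terminate
        return 0
    sleep_ms, calculations = int(args[0]), int(args[1])
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        calc = 0
        while calc < calculations:  # busy work: count up to `calculations`
            calc += 1
        sleep(sleep_ms / 1000)  # idle before the next burst
        cycles += 1
    return cycles
```

A shorter sleep time means more calculation bursts per unit of time and therefore higher CPU usage, which is how the sleep-time argument controls the load.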
SimulationSupervisor
The SimulationSupervisor is most easily understood by studying the flow of the following pseudo-algorithm:
1. Read and parse configFile given by program argument.
2. Check if API for Snaptel and Docker is active.
3. Start Simulation via Docker API with given argument in configFile.
4. Start collection from Snaptel and RAPL.
5. Wait for 2 minutes and 20 seconds.
6. Stop Snaptel and RAPL collections.
7. Stop the Simulation.
8. Parse the files of collected metrics.
9. Make MATLAB calculation to get relevant data from the metrics.
To avoid including metric data affected by the collection tool, the parsing removes ten seconds from the beginning and end of each simulation, resulting in a total run time of two minutes.
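The trimming step can be sketched as below, assuming one sample per second as in the collection setup. The function name and parameters are hypothetical and not taken from the SimulationSupervisor.

```python
def trim_samples(samples, interval_s=1, trim_s=10):
    """Drop trim_s seconds of samples from both the start and the end of a run."""
    n = int(trim_s / interval_s)
    return samples[n:len(samples) - n]
```

A run of 2 minutes and 20 seconds sampled once per second yields 140 samples, of which 120 remain after trimming.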
Experiments and results
Each experiment is run on three different physical computers with identical hardware and the software settings described in Section 4.2. The input variables for the Simulator, as described in Section 4.2, can be seen in Table 1. The row with a sleep time of 6 000 000 ms makes the simulator sleep throughout the whole simulation; keep in mind that the collection used in Section 5 starts ten seconds after the simulator has started, which disregards the calculation iteration done first in that run.
Table 1: Each row is a single test that will be run three times.

Sleep time (ms)   Calculations
0                 1 000 000 000
100               1 000 000 000
200               1 000 000 000
300               1 000 000 000
400               1 000 000 000
500               1 000 000 000
600               1 000 000 000
700               1 000 000 000
800               1 000 000 000
900               1 000 000 000
1 000             1 000 000 000
6 000 000         1 000 000 000
From the collected data, linear regression is used to calculate energy consumption as a function of CPU usage for the simulation result. The CPU usage is collected from Docker and pkg0+dram is collected from RAPL. The regression gives the equation

$$ \hat{y} = a + bx \tag{1} $$

where $\hat{y}$ is the predicted value of $y$ for any given value $x$, $a$ is the estimated intercept and $b$ is the estimated slope. With $\bar{x}$ and $\bar{y}$ denoting the sample means:

$$ b = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} \tag{2} $$

$$ a = \bar{y} - b\bar{x} \tag{3} $$

The Pearson correlation coefficient (PCC), a measure of the linear correlation between the two sets, is calculated for each set as seen in Equation 4.

$$ \mathrm{PCC} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}} \tag{4} $$
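The slope, intercept and Pearson correlation coefficient can be computed directly from the sums in the equations above. The following Python sketch is for illustration only; the thesis performed this step in MATLAB, and the function name is hypothetical.

```python
from math import sqrt


def linear_fit(xs, ys):
    """Return (intercept a, slope b, Pearson correlation) for y = a + b*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    b = sxy / sxx              # slope
    a = y_mean - b * x_mean    # intercept
    pcc = sxy / sqrt(sxx * syy)
    return a, b, pcc
```

Here `xs` would hold the CPU usage values from Docker and `ys` the corresponding pkg0+dram values from RAPL, one pair per simulation run.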
5 Result
This section presents data from the proof of concept benchmark in Section 4.2. The raw data is shown in the tables of Appendices A and B. The figures in this section are also shown at full page width in Appendix C. Each data point in this section relates to one simulation run on one specific computer. In the figures, it is irrelevant which simulation run a point belongs to, since the main objective is to relate CPU usage to energy consumption.
[Figure 5 panels: (a) Simulation results of computer B. (b) Simulation results of computer C. (c)]
Figure 5: Energy and memory usage measured against CPU usage for each computer. Note that the memory values are not to scale within the subfigures, so that they do not distort the graphs.
The subfigures of Figure 6 show essentially the same data: since the data was collected over 120 seconds, the Joule and Watt values differ only by a constant factor.
[Figure 6 panels: (a), (b)]
Figure 6: The sum of the RAPL collection from pkg0 and dram in relation to CPU usage.
Linear regression is calculated on the data from Figure 6, as shown in Tables 2 and 3. The calculation for all the data combined is also presented.
Table 2: Linear regression results for CPU usage from Docker against pkg0+dram in Joule from RAPL, for each computer. The row 'All' combines the data from all computers.

Computer   Slope          Intercept   PCC
B          1.713582E-08   1201.385    0.9967
C          1.702123E-08   1129.752    0.9969
Q          1.713063E-08   1161.934    0.9957
All        1.708180E-08   1164.981    0.9925
Table 3: Linear regression results for CPU usage from Docker against pkg0+dram calculated as Watt from RAPL, for each computer. The row 'All' combines the data from all computers.

Computer   Slope          Intercept   PCC
B          1.427985E-10   10.0115     0.9968
C          1.418435E-10   9.4146      0.9969
Q          1.427553E-10   9.6828      0.9957
All        1.423483E-10   9.7082      0.9925
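As a consistency check, the Watt coefficients in Table 3 follow from the Joule coefficients in Table 2 by dividing by the 120-second collection window. The Python sketch below shows this for the 'All' row and uses the Watt equation for a prediction. The unit of the CPU usage value is not stated in the thesis; the example in the note after the code assumes it is cumulative CPU time in nanoseconds over the window, which is an assumption made here for illustration only.

```python
WINDOW_S = 120  # collection window per run in seconds, from the text

# 'All' rows of Table 2 (Joule) and Table 3 (Watt)
SLOPE_J, INTERCEPT_J = 1.708180e-08, 1164.981
SLOPE_W, INTERCEPT_W = 1.423483e-10, 9.7082


def predict_watts(cpu_usage):
    """Predict average power from total CPU usage with the 'All' Watt equation."""
    return INTERCEPT_W + SLOPE_W * cpu_usage


# Dividing the Joule coefficients by the window length reproduces the Watt row
# up to the rounding of the published values.
slope_from_joule = SLOPE_J / WINDOW_S
intercept_from_joule = INTERCEPT_J / WINDOW_S
```

Under the nanosecond assumption, one core kept fully busy for the whole window corresponds to a CPU usage of 1.2e11, for which `predict_watts` gives a value between 26 and 27 Watt.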
6 Discussion
This section is divided into a discussion and conclusion of the results from the proof of concept, a discussion of the related work, and a follow-up discussion of predicting energy consumption as a method of pricing.
6.1 Proof of concept
The results do not include a baseline for what the computers consumed without the simulation started. Estimating the energy consumption of the collection tool was more time consuming than the scope of the proof of concept allowed. This concerns the snaptel tool consuming more energy while the simulation was running, since more data was being collected, even though this would not have affected the results, since it was negligible.
The biggest source of uncertainty is that the simulation was not run in a controlled environment, since the computers used were in an open laboratory environment. Due to restrictions in the laboratory environment, the containerized environment had to run inside a virtual machine. Given this, the results should be taken as an indication of the relation between CPU usage and energy consumption. It should also be noted that each test was only run once per computer, which limits the certainty of the results.
Some informal, unscientific measurements were made of the power usage of the computers when simulation and collection were not active, ending at 8.7-9.0 Watt. When using all cores at 100% for 1 minute, the consumption was around 93 Watt with a deviation of 2 Watt; the TDP⁴ of the CPU is 84 Watt, making the numbers plausible. Since the Virtual Machine was only using one core, it is probable that the simulation, when using the full potential of the CPU, ended at 26-27 Watt.
The memory usage presented in the results is unexplained and presumed to come from the JVM or Docker; it is, however, very small, did not seem to affect the energy consumption, and was therefore disregarded.
The proof of concept outcome gives a feasible indication of predicting the energy consumption of the simulation data. When using the linear regression equation on the data, the biggest difference was on Computer C with a 0.505 Watt deviation, while using the All equation the biggest deviation was 0.813 Watt. However, predicting the data included in the simulation set is improper but will suffice in a proof of concept. The PCC (Pearson correlation coefficient) confirms the indication of feasibility, deviating by less than 0.01 from 1.
⁴ Thermal Design Power.
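The deviation figures quoted above read as the largest absolute difference between a measured value and the value predicted by the regression line. A minimal Python sketch of that computation follows, under the assumption that this is how the deviation was defined; the function name is hypothetical.

```python
def max_abs_deviation(xs, ys, intercept, slope):
    """Largest absolute difference between measured ys and intercept + slope*x."""
    return max(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))
```

Applied to the CPU usage and Watt values of one computer with that computer's coefficients from Table 3, this would give the per-computer figure; applied to all data with the 'All' coefficients, the combined figure.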
6.2 Related work
Several of the articles in Section 3 are of high scientific quality. They vary in what is highlighted as remarkable in the results, even when the results are similar. There are no clear contradictions, but rather a consensus on the effect of containerization in comparison to bare metal and virtual machines regarding energy consumption. The articles introduced in Section 3 cover an immense breadth within the area of energy consumption and distribution.
The technological advancement, especially in containerization and IaaS, provides several benefits over hypervisors/virtual machines. Nevertheless, the security discussion on containerization is not without weight. The author believes containerization has much to offer the cloud service market, but it will not be relevant until cloud providers can offer the container environment without the need to wrap it within a virtual machine. This is from the aspect of energy consumption and pricing.
6.3 Predicting energy consumption, pricing, and future work
It is not unreasonable to approach the question at hand with machine learning, producing a learned function that predicts energy consumption from the metrics given by the container software environment. Again, since the proof of concept only works with CPU usage, the tests must be extended to other devices, notably memory and traffic.
For future work building on the proof of concept, a real application is more complex than the test bench used here: memory usage and other devices, such as traffic and hard disk, must be taken into account. The results and conclusions must be treated with caution when related to any typical application or other benchmarks. A more extensive test bench and repeated iterations of the experiments are needed before the results can be taken into consideration.
While there ain't no such thing as a free lunch for the question at hand, it does propose an interesting suggestion for the pricing of cloud services. As mentioned in [32], a common complaint is unfairness in pricing. As in all kinds of service markets, the provider offering the lowest price for the same quality is commonly the one that gets the customers. The market for IaaS within containerization is potentially still young, and as new providers enter the market, prices are bound to be pushed down. The effective cost of a data center is mainly a matter of energy consumption, which ties together billing, fair pricing, and climate. A direct association between energy consumption and pricing should therefore be of interest for cloud services within IaaS.
References
[1] Amazon Alexa. Alexa top 500 global sites. http://www.alexa.com/topsites. (Accessed on 05/10/2017).
[2] Amazon Alexa. Wikipedia.org traffic, demographics and competitors - alexa. http://www.alexa.com/siteinfo/wikipedia.org. (Accessed on 05/23/2017).
[3] ArchLinux. Kvm - archwiki. https://wiki.archlinux.org/index.php/KVM. (Accessed on 05/23/2017).
[4] The Kubernetes Authors. What is kubernetes? | kubernetes. https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/. (Accessed on 05/23/2017).
[5] Google Cloud. Google container engine pricing and quotas | container engine documentation | google cloud platform. https://cloud.google.com/container-engine/pricing. (Accessed on 05/21/2017).
[6] ComputeNext Colman E. Cloud computing basics - iaas, paas, saas comparison | computenext. https://www.computenext.com/blog/when-to-use-saas-paas-and-iaas/. (Accessed on 06/01/2017).
[7] Creative Commons. Linux containers. https://linuxcontainers.org/. (Accessed on 05/23/2017).
[8] Intel Dimitriov M. Intel® power governor | intel® software. https://software.intel.com/en-us/articles/intel-power-governor. (Accessed on 06/02/2017).
[9] Cartika Fougere R. Evolution of infrastructure-as-a-service (iaas) | cartika - cartika. https://www.cartika.com/blog/iaas/. (Accessed on 06/01/2017).
[10] Linux Foundation. The xen project, the powerful open source industry standard for virtualization. https://www.xenproject.org/. (Accessed on 05/23/2017).
[11] CIO From IDG. What are containers and why do you need them? | cio. http://www.cio.com/article/2924995/software/what-are-containers-and-why-do-you-need-them.html. (Accessed on 05/10/2017).
[12] Docker Inc. Docker engine api and sdks - docker documentation. https://docs.docker.com/engine/api/. (Accessed on 05/21/2017).
[13] Docker Inc. What is docker? https://www.docker.com/what-docker. (Accessed on 05/20/2017).
[14] Krzywda Jakub. Analysing, modelling and controlling power-performance tradeoffs in data center infrastructures. Umea Universitet, 2017.
[15] Strauss D. Linux Journal. Containers—not virtual machines—are the future cloud | linux journal. http://www.linuxjournal.com/content/containers%E2%80%94not-virtual-machines%E2%80%94are-future-cloud. (Accessed on 05/10/2017).
[16] Krolik M. Github - intelsdi-x/snap-plugin-collector-docker: Collects docker container runtime metrics. https://github.com/intelsdi-x/snap-plugin-collector-docker. (Accessed on 05/23/2017).
[17] Algreene B. Mashable. How big is the cloud? http://mashable.com/2012/10/04/how-big-is-the-cloud/#oyr1wu4mppqb, 10 2012. (Accessed on 05/10/2017).
[18] R. Morabito. Power consumption of virtualization technologies: An empirical investigation. In 2015 IEEE/ACM 8th International Conference on Utility and Cloud Computing (UCC), pages 522–527, Dec 2015.
[19] Preeth E N, F. J. P. Mulerickal, B. Paul, and Y. Sastri. Evaluation of docker containers based on hardware utilization. In 2015 International Conference on Control Communication Computing India (ICCC), pages 697–700, Nov 2015.
[20] Nginx. What is load balancing? how load balancers work. https://www.nginx.com/resources/glossary/load-balancing/. (Accessed on 05/10/2017).
[21] Oracle. Oracle vm virtualbox. https://www.virtualbox.org/. (Accessed on 05/22/2017).
[22] Kamp P-H. Introduction to varnish | varnish http cache. https://www.varnish-cache.org/intro/index.html#intro. (Accessed on 05/10/2017).
[23] J. Liu R. Shea, H. Wang. Power consumption of virtual machines with network transactions: Measurement and improvements. In IEEE INFOCOM 2014 - IEEE Conference on Computer Communications, pages 1051–1059, April 2014.
[24] Solinas C. Hindle A. Santos E. A., M. Carson. How does docker affect energy consumption? evaluating workloads in and out of docker containers. https://arxiv.org/abs/1705.01176, 5 2017. (Accessed on 05/21/2017).
[25] Mina Sedaghat, Francisco Hernandez-Rodriguez, and Erik Elmroth. A virtual machine re-packing approach to the horizontal vs. vertical elasticity trade-off for cloud autoscaling. In Proceedings of the 2013 ACM Cloud and Autonomic Computing Conference, CAC '13, pages 6:1–6:10, New York, NY, USA, 2013. ACM.
[26] Amazon EC2 Container Service. Aws | amazon ec2 container service | pricing. https://aws.amazon.com/ecs/pricing/. (Accessed on 05/21/2017).
[27] Taylor T. Github - intelsdi-x/snap-plugin-publisher-file: Publishes snap metrics to a file. https://github.com/intelsdi-x/snap-plugin-publisher-file. (Accessed on 05/23/2017).
[28] Shapland R. TechTarget. Cloud containers – what they are and how they work. http://searchcloudsecurity.techtarget.com/feature/Cloud-containers-what-they-are-and-how-they-work. (Accessed on 05/10/2017).
[29] Snap telemetry Home. Snap - a powerful open telemetry framework. http://snap-telemetry.io/. (Accessed on 05/10/2017).
[30] S.K. Tesfatsion, E. Wadbro, and J. Tordsson. A combined frequency scaling and application elasticity approach for energy-efficient cloud computing. Sustainable Computing: Informatics and Systems, 4(4):205–214, 2014. Special Issue on Energy Aware Resource Management and Scheduling (EARMS).
[31] Red Hat Enterprise thildred. The history of containers – red hat enterprise linux blog. http://rhelblog.redhat.com/2015/08/28/the-history-of-containers/. (Accessed on 05/20/2017).
[32] Hongyi Wang, Qingfeng Jing, Rishan Chen, Bingsheng He, Zhengping Qian, and Lidong Zhou. Distributed systems meet economics: pricing in the cloud. In HotCloud'10.
[33] WikiMedia. Dashiki: Report card. https://analytics.wikimedia.org/dashboards/reportcard/#pageviews-july-2015-now/monthly-pageviews-2015-now. (Accessed on 05/10/2017).
[34] WikiMedia. Page views for wikipedia, both sites, normalized. https://stats.wikimedia.org/EN/TablesPageViewsMonthlyCombined.htm. (Accessed on 05/10/2017).
[35] Wikimedia. Wikimedia servers - meta. https://meta.wikimedia.org/wiki/Wikimedia_servers. (Accessed on 05/10/2017).
[36] WikiMedia. Wikimedia traffic analysis report - overview. https://stats.wikimedia.org/wikimedia/squids/SquidReportPageViewsPerCountryOverview.htm. (Accessed on 05/10/2017).
Appendix A Parsed raw data
The table will begin on the next page due to the size of the table.
Table 4: Raw data from results. Note that times are collected from the RAPL collection. Snaptel data is the closest in time, making a divergence of 0.5 seconds. The RAPL collection is in microjoules.

| Computer | Arg0 | Start time | Start Docker total usage CPU | Start Docker total usage memory | Start RAPL pk0 | Start RAPL dram | Start RAPL core | End time | End Docker total usage CPU | End Docker total usage memory | End RAPL pk0 | End RAPL dram | End RAPL core |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B | 0 | 21:05:44.171361240 | 2457787341 | 12767232 | 113806675231 | 65341006225 | 92781456420 | 21:07:44.554582698 | 116537291845 | 12783616 | 116510999572 | 65808592590 | 94459352600 |
| B | 0 | 21:05:42.147775479 | 2258292803 | 12492800 | 145335872375 | 68216903015 | 250037423889 | 21:07:42.533203833 | 116138535889 | 12513280 | 147941998718 | 68685509643 | 251621821228 |
| B | 0 | 21:06:18.334701850 | 2650571500 | 11272192 | 37452913269 | 192960464965 | 221754854431 | 21:08:18.712443202 | 116654746383 | 11292672 | 40108932312 | 193428250915 | 223361496887 |
| B | 100 | 21:09:53.731203886 | 1389637469 | 12771328 | 117873376281 | 66308458312 | 95073806457 | 21:11:53.103209968 | 84909003431 | 12791808 | 119994147338 | 66771113708 | 96298356506 |
| B | 100 | 21:09:53.010755167 | 2089899904 | 13869056 | 149170426940 | 69189778442 | 252131173156 | 21:11:53.387093733 | 85439042422 | 13885440 | 151264231384 | 69656622558 | 253324938537 |
| B | 100 | 21:10:27.014320801 | 784081846 | 12357632 | 41421615112 | 193925571166 | 223922890075 | 21:12:27.393975922 | 84196646974 | 12374016 | 43504055297 | 194392511901 | 225085549438 |
| B | 200 | 21:11:58.969948759 | 1174749938 | 12754944 | 120099563537 | 66794353088 | 96360499267 | 21:13:58.350560243 | 67326173617 | 12754944 | 121976452026 | 67256383789 | 97399616088 |
| B | 200 | 21:11:58.243865609 | 1914358023 | 13754368 | 151352600463 | 69675928588 | 253377289428 | 21:13:58.631734208 | 73842069442 | 13754368 | 153250989440 | 70141985717 | 254412775329 |
| B | 200 | 21:12:33.172695361 | 1348059275 | 13819904 | 43612122619 | 194415425170 | 225148880310 | 21:14:33.545904386 | 66561361811 | 13840384 | 45479557678 | 194882207946 | 226159417114 |
| B | 300 | 21:14:03.891189071 | 1176419136 | 12337152 | 122098481079 | 67278599609 | 97479275878 | 21:16:03.276264584 | 55522767847 | 12337152 | 123753939636 | 67740473205 | 98339313476 |
| B | 300 | 21:14:02.076373301 | 910305711 | 12906496 | 153292549316 | 70151906616 | 254437212463 | 21:16:01.941169980 | 55486923187 | 12906496 | 154861871337 | 70619228820 | 255230972961 |
| B | 300 | 21:14:38.736919556 | 1043783726 | 13201408 | 45564346984 | 194902901062 | 226205625122 | 21:16:38.521606215 | 54654058440 | 13201408 | 47166505920 | 195366692687 | 227012677368 |
| C | 400 | 21:16:10.181745644 | 1018516115 | 12763136 | 123868007080 | 67767703674 | 98403939636 | 21:18:10.565914823 | 47737226632 | 12763136 | 125398406860 | 68232888000 | 99170765930 |
| C | 400 | 21:16:07.544621487 | 967372068 | 13791232 | 154954799926 | 70641369018 | 255284387390 | 21:18:07.936946176 | 54394885292 | 13791232 | 156561557373 | 71107115112 | 256107265014 |
| C | 400 | 21:16:45.326107611 | 1343185835 | 12374016 | 47294851440 | 195393645263 | 227088087036 | 21:18:45.700329464 | 47156411098 | 12374016 | 48753530761 | 195859569274 | 227788313476 |
| C | 500 | 21:18:15.164394305 | 1140523030 | 11350016 | 125477770141 | 68251237121 | 99216282287 | 21:20:15.551946055 | 41689732447 | 11350016 | 126930685913 | 68716583557 | 99913325317 |
| C | 500 | 21:18:11.441156737 | 601910729 | 11304960 | 156626468261 | 71121195129 | 256145243347 | 21:20:11.828786546 | 40967035484 | 12066816 | 157968407043 | 71586164367 | 256763251159 |
| C | 500 | 21:18:50.610353398 | 861751739 | 12795904 | 48832148620 | 195879162475 | 227831355102 | 21:20:50.361980361 | 40698216729 | 12795904 | 50238648498 | 196342727233 | 228484128662 |
| C | 600 | 21:22:23.653989376 | 596999914 | 12771328 | 127718957214 | 69210765014 | 100086761108 | 21:24:23.038049721 | 36546918464 | 12771328 | 129085248046 | 69671995544 | 100733803588 |
| C | 600 | 21:22:21.939424619 | 737553675 | 12455936 | 158823277526 | 72088599060 | 256973655395 | 21:24:21.323769450 | 36524286644 | 12455936 | 160068347167 | 72549475219 | 257520439331 |
| C | 600 | 21:22:58.722047524 | 670742094 | 11350016 | 51053879943 | 196839449218 | 228664179199 | 21:24:58.096195189 | 35741148621 | 11350016 | 52369001708 | 197301687622 | 229247434936 |
| C | 700 | 21:24:29.124905324 | 672020956 | 13242368 | 129172027221 | 69696062316 | 100780558227 | 21:26:29.511318048 | 33102280930 | 13242368 | 130478833740 | 70160728088 | 101376849670 |
| C | 700 | 21:24:25.587601795 | 616126612 | 12742656 | 160131647521 | 72566399169 | 257555916015 | 21:26:25.978879619 | 32557350204 | 12742656 | 161326270812 | 73031053283 | 258064803894 |
| C | 700 | 21:25:03.784630855 | 593636655 | 12955648 | 52448671081 | 197324355895 | 229288623535 | 21:27:03.158927302 | 32342563692 | 12955648 | 53642847473 | 197785798950 | 229790312255 |
| Q | 800 | 21:26:34.954638169 | 620217554 | 12763136 | 130557943908 | 70182336669 | 101418598632 | 21:28:34.343668479 | 30079961348 | 12763136 | 131796916381 | 70643094970 | 101954777465 |
| Q | 800 | 21:26:30.256430526 | 610614079 | 13340672 | 161387073120 | 73048058410 | 258097229370 | 21:28:30.649043254 | 29909842379 | 13340672 | 162568649047 | 73512912719 | 258600438842 |
| Q | 800 | 21:27:08.898875139 | 593487513 | 12791808 | 53720687316 | 197808508789 | 229830393615 | 21:29:08.277414465 | 29194624978 | 12791808 | 54887850402 | 198270226257 | 230322969787 |
| Q | 900 | 21:28:38.377314334 | 602719583 | 12935168 | 131858385498 | 70659179443 | 101988009826 | 21:30:38.774497151 | 27456874673 | 12935168 | 133072874328 | 71124007263 | 102507428466 |
| Q | 900 | 21:28:35.373078765 | 601682264 | 12759040 | 162633274230 | 73531662048 | 258634709716 | 21:30:35.771198083 | 27180918981 | 12759040 | 163714957946 | 73996054687 | 259068794067 |
| Q | 900 | 21:29:12.509675806 | 349671402 | 11292672 | 54950030334 | 198287225341 | 230355459594 | 21:31:12.893871501 | 26825542385 | 11325440 | 56098612731 | 198752547302 | 230817764831 |
| Q | 1000 | 21:07:48.209150890 | 603056345 | 12558336 | 116580837280 | 65823255676 | 94501304870 | 21:09:48.602405532 | 25326574420 | 12558336 | 117788741638 | 66288016174 | 95025709167 |
| Q | 1000 | 21:07:47.188771265 | 615458804 | 11321344 | 148029230712 | 68703973815 | 251673922058 | 21:09:47.587831988 | 25115723783 | 11321344 | 149085747985 | 69168332458 | 252083972839 |
| Q | 1000 | 21:08:23.649773749 | 736277815 | 12738560 | 40186492675 | 193447924926 | 223404263183 | 21:10:23.032135848 | 24890265705 | 12738560 | 41357956970 | 193909628601 | 223887850219 |
| Q | 6000000 | 21:20:19.504897664 | 58806831 | 13008896 | 126975751708 | 68732355224 | 99934533630 | 21:22:19.900463689 | 146882901 | 13008896 | 127661017272 | 69195724487 | 100054920837 |
| Q | 6000000 | 21:20:16.327784550 | 57146939 | 12259328 | 158026046875 | 71603986694 | 256791928710 | 21:22:16.723080014 | 141411107 | 12259328 | 158753841430 | 72067927490 | 256937554565 |
| Q | 6000000 | 21:20:54.972461192 | 57364742 | 12566528 | 50294706176 | 196361067321 | 228511350952 | 21:22:54.348316686 | 141069151 | 12566528 | 50988405578 | 196821618408 | 228629650756 |
Appendix B CPU usage, Joule, and Watt
Table 5: Data from Table 4, changed to joule and calculated Watt

| Computer | Arg0 | CPU Usage | Joule Pkg0 | Watt (Joule pkg0/120) |
|---|---|---|---|---|
| B | 0 | 88076070 | 1148.634827 | 9.571957 |
| B | 100 | 24723518075 | 1672.664856 | 13.938874 |
| B | 200 | 26854155090 | 1679.316650 | 13.994305 |
| B | 300 | 29459743794 | 1699.730774 | 14.164423 |
| B | 400 | 32430259974 | 1771.472291 | 14.762269 |
| B | 500 | 35949918550 | 1827.521362 | 15.229345 |
| B | 600 | 40549209417 | 1918.262208 | 15.985518 |
| B | 700 | 46718710517 | 1995.584106 | 16.629868 |
| B | 800 | 54346348711 | 2117.332153 | 17.644435 |
| B | 900 | 66151423679 | 2338.919190 | 19.490993 |
| B | 1000 | 83519365962 | 2583.426453 | 21.528554 |
| B | 6000000 | 114079504504 | 3171.910706 | 26.432589 |
| C | 0 | 84264168 | 1191.735351 | 9.931128 |
| C | 100 | 24500264979 | 1520.875916 | 12.673966 |
| C | 200 | 26579236717 | 1546.076355 | 12.883970 |
| C | 300 | 29299228300 | 1646.430236 | 13.720252 |
| C | 400 | 31941223592 | 1659.277405 | 13.827312 |
| C | 500 | 35786732969 | 1705.945800 | 14.216215 |
| C | 600 | 40365124755 | 1806.908020 | 15.057567 |
| C | 700 | 53427513224 | 2072.503541 | 17.270863 |
| C | 800 | 54576617476 | 2036.644225 | 16.972035 |
| C | 900 | 71927711419 | 2364.446106 | 19.703718 |
| C | 1000 | 83349142518 | 2560.648560 | 21.338738 |
| C | 6000000 | 113880243086 | 3074.732971 | 25.622775 |
| Q | 0 | 83704409 | 1154.250489 | 9.618754 |
| Q | 100 | 24153987890 | 1633.167970 | 13.609733 |
| Q | 200 | 26475870983 | 1613.904358 | 13.449203 |
| Q | 300 | 28601137465 | 1628.880554 | 13.574005 |
| Q | 400 | 31748927037 | 1655.619447 | 13.796829 |
| Q | 500 | 35070406527 | 1777.360169 | 14.811335 |
| Q | 600 | 39836464990 | 1870.064636 | 15.583872 |
| Q | 700 | 45813225263 | 1924.603332 | 16.038361 |
| Q | 800 | 53610274714 | 2065.950561 | 17.216255 |
| Q | 900 | 65213302536 | 2334.217835 | 19.451815 |
| Q | 1000 | 83412565128 | 2549.380920 | 21.244841 |
| Q | 6000000 | 114004174883 | 3123.804993 | 26.031708 |
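The conversion from the raw counters in Table 4 to the Joule and Watt columns of Table 5 can be sketched as below. The sketch rests on an assumption inferred from the numbers rather than stated in the text: that the Joule column is the sum of the pk0 and dram counter differences. With the first data row of Table 4 as input, it reproduces the Table 5 row whose CPU usage is 114079504504. The names `energy_joule` and `WINDOW_SECONDS` are illustrative only.

```python
# Length of the collection window in seconds; Table 5 divides Joule by 120.
WINDOW_SECONDS = 120


def energy_joule(start_uj, end_uj):
    """Energy used between two RAPL counter readings given in microjoules."""
    return (end_uj - start_uj) / 1_000_000


# First data row of Table 4: start and end readings.
cpu_start, cpu_end = 2457787341, 116537291845
pk0_start, pk0_end = 113806675231, 116510999572
dram_start, dram_end = 65341006225, 65808592590

cpu_usage = cpu_end - cpu_start                    # Docker total CPU usage delta
pk0_joule = energy_joule(pk0_start, pk0_end)       # package energy
dram_joule = energy_joule(dram_start, dram_end)    # DRAM energy

# Assumption: the Table 5 Joule column is package plus DRAM energy.
total_joule = pk0_joule + dram_joule
watt = total_joule / WINDOW_SECONDS

print(cpu_usage)             # 114079504504
print(f"{total_joule:.6f}")  # 3171.910706
print(f"{watt:.6f}")         # 26.432589
```

Dividing by a fixed 120 seconds follows the header of Table 5; the timestamps in Table 4 show that the actual windows differ from 120 seconds by a fraction of a second.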
Appendix C Result figures, full page width size
Figure 7: Graph of results computer B, CPU usage as x-axis
Figure 8: Graph of results computer C, CPU usage as x-axis
Figure 9: Graph of results computer Q, CPU usage as x-axis
Figure 10: Graph of results of pkg0+dram in watt, CPU usage as x-axis
Figure 11: Graph of results of pkg0+dram in Joule, CPU usage as x-axis