Review Based Analysis on Performance of Cloud Platforms

1 G. Kavitha, 2 Shobini. B and 3 G. Shiva Krishna

1, 2, 3 Computer Science and Engineering, Swathi Institute of Technology & Sciences, Near Ramoji Film City, Beside Kothagudem 'X' Roads, Hyderabad, Telangana 501512

Abstract

The cloud has emerged as a platform for outsourcing data and computation. A Cloud Service Provider (CSP) maintains the cloud and offers services to cloud consumers, with agreements between provider and consumer governing the different services. Several cloud platforms are available, each with its own widely used services and deployment models. Because cloud services are affordable, outsourcing data and computing has become common practice for organizations. In this context, aspects such as cost, security and ease of use can help users make well-informed decisions. To analyze the performance of cloud platforms, we developed a framework in which users and administrators interact with the system and users provide reviews of various cloud platforms. We propose an algorithm that analyzes these reviews to extract useful information. The algorithm is evaluated with a prototype, and the results show the usefulness of the proposed system.

Keywords – Cloud computing, cloud performance, user reviews, clustering, business intelligence

Science, Technology and Development

Volume VIII Issue X OCTOBER 2019

ISSN : 0950-0707

Page No : 29

1. INTRODUCTION

Cloud computing has become the dominant paradigm for large-scale infrastructure, in which a third party provides shared virtual computing and storage resources to a client. Using third-party infrastructure reduces costs for the client, who does not need to invest in infrastructure and maintenance. However, negotiating Service Level Agreements (SLAs) for Infrastructure-as-a-Service (IaaS) remains a challenging problem. One of the main difficulties in guaranteeing performance in the cloud is the inherent randomness produced by the massive number of time-varying interactions and events taking place in the system. It is nevertheless possible to optimize different performance measures on average, based on the number and size of the available virtual resources.

Figure 1: Benefits of cloud computing (low cost, mobility, data loss prevention, security, productivity, collaboration, flexibility, software update)

As shown in Figure 1, cloud computing offers many benefits that make it an attractive solution for enterprises. It increases productivity by providing pre-configured environments, software, hardware, servers and a host of other computing resources. It is a low-cost solution for computing and storage because it is based on virtualization. Flexibility is another advantage: computing resources are provisioned elastically according to user demand. The cloud also supports collaboration with partners in research and other activities, and such business collaborations make it easier for organizations to grow. Maintaining and updating software becomes easier because it avoids the traditional, cumbersome work of client-side installation and maintenance. Cloud computing enables mobility, as users can access it from anywhere in the world, and it makes it possible to obtain services from different vendors. Providers claim that cloud computing has mechanisms to prevent data loss and to provide secure access to data. Nevertheless, cloud servers are considered untrusted from the user's point of view.

Although many approaches have been proposed to overcome this problem, commercial clouds have not been able to implement efficient systems in which users pay for specific performance measures, such as CPU and memory utilization, rather than a flat hourly rate. In commercial IaaS offerings such as Amazon Elastic Compute Cloud (EC2) from Amazon Web Services (AWS), the set of possible configurations allows only coarse-grained variation of the system's control inputs. This makes it challenging to apply performance regulation using existing techniques based on model identification and control systems theory.

With the growth of online reviews of cloud platforms, mechanisms are needed to process those reviews and help users make well-informed decisions. To this end, we propose a framework for analyzing user reviews of cloud computing platforms. The reviews are processed to extract aspect-based information on aspects such as price and security. The contributions of this paper are as follows.

1. We propose a framework for mining online reviews on the performance of different cloud platforms.

2. We propose an algorithm, Cloud Platforms Review Data Analysis (CP-RDA), to gain intelligence on the performance of cloud platforms based on user reviews.

3. We build an application to evaluate the proposed algorithm and provide aspect-based analysis of cloud performance.

The remainder of the paper is structured as follows. Section 2 reviews the literature. Section 3 presents the proposed framework and the underlying algorithm. Section 4 provides implementation details. Section 5 presents experimental results. Section 6 concludes the paper and outlines directions for future work.

2. RELATED WORK

This section reviews literature on the cloud and big data ecosystem in distributed environments. Wang et al. [1] studied the utility of the MapReduce programming paradigm across different data centers in a distributed environment. They used two big data examples: the data associated with the Large Hadron Collider (LHC) and with High Energy Physics (HEP). They extended Hadoop into G-Hadoop, which exploits multiple data centers. Zhao et al. [2] extended G-Hadoop with cryptography and SSL for secure big data processing across different data centers; their framework also simplified job scheduling and authentication procedures.

Xavier et al. [3] investigated different virtualization systems used for MapReduce clusters. Virtualization is the technology used to make better use of computing resources. They focused on container-based virtualization, as it prevails in real-world MapReduce frameworks, and found high performance with Linux Containers (LXC). Katal et al. [4], on the other hand, studied the need for big data processing with the MapReduce programming paradigm. They also specified issues and challenges related to big data processing and described existing big data projects related to Big Science, government, the private sector and international development. With respect to data analytics, they identified challenges such as volume, storage, analysis, significance, best practices, technical challenges and the skills needed. Karande et al. [5] studied the advantages of Hadoop cluster optimizations for high-performance big data analytics. They used Amazon S3 for storage and Elastic MapReduce (EMR) for processing big data using the MapReduce programming paradigm.

Fernandez et al. [6] explored big data, cloud computing, distributed programming frameworks and the use of MapReduce. They proposed a big data framework that supports data mining and the extraction of business intelligence from big data. Vavilapalli et al. [7] studied YARN, the resource negotiator associated with Apache Hadoop. It decouples the programming model from resource management for more effective processing of big data. In fact, YARN is Hadoop’s compute platform, which delegates many scheduling functions in order to make the programming model more effective. Kumar et al. [8] executed the K-Means algorithm on a Hadoop cluster to verify and validate MapReduce functionality in a distributed environment; that is, the traditional K-Means algorithm is executed in a parallel processing environment using MapReduce programming.

Grolinger et al. [9] studied the challenges that arise when big data is processed in MapReduce programming frameworks. The main challenges they identified are data storage, analytics, online processing, and privacy and security issues. Kambatla et al. [10] shed light on various trends in big data analytics. The trends are examined in terms of hardware platforms for data analytics, virtualization technologies, software stacks for analytics applications, and application scope in emerging applications. Miller et al. [11] investigated open source frameworks for big data analytics. They covered frameworks such as Apache Hadoop, YARN, GPS, Pregel and Apache Spark. The Scala-based frameworks they covered include Spark, Kafka, Samza and ScalaTion. Mythreyee et al. [12] studied the relationship between cloud and big data processing.

3. PROPOSED SOLUTION

The proposed framework lets different kinds of users, namely end users and administrators, perform different operations. The architecture diagram shows the relationships between the components of the system and is important for understanding the overall concept: the principal parts or functions are represented by blocks, connected by lines that show how the blocks relate. End users work with different cloud platforms and post reviews of them online.

Figure 2: Overview of the system architecture

As presented in Figure 2, many cloud service providers can render services to users. When users work on the cloud and post reviews giving their opinion of the services, that data can be mined to produce valuable information.

3.1 Cloud Platforms Review Data Analysis Algorithm

The proposed algorithm is used for analyzing user reviews on the performance of different platforms.

Algorithm: Cloud Platforms Review Data Analysis (CP-RDA)

Inputs : Review dataset D, aspects A

Output : Aspect based clusters G

Making Document Corpus
1. Start
2. Initialize document corpus vector D’
3. Initialize matrix of vectors V
4. Initialize aspect based vector G
5. For each instance d in D
6.     Extract review into a document d’
7.     Add d’ to D’
8. End For

Pre-processing
9. For each document d’ in D’
10.     Remove stop words from d’
11.     Perform stemming on d’ using Porter Stemming algorithm
12. End For

Generating TF/IDF Matrix of Vectors
13. For each document d’ in D’
14.     For each word w in d’
15.         Compute the TF/IDF weight of w and store it in vector v
16.     End For
17.     Add v to V
18. End For

Finding Aspect Based Analysis
19. Use K-Means for making clusters based on A
20. Use cosine similarity to group vectors in V to form document clusters
21. Group associated clusters to G
22. Return G
23. Stop

Algorithm 1: Cloud Platforms Review Data Analysis Algorithm

As presented in Algorithm 1, the algorithm takes the review dataset and the aspects as input. The data is first pre-processed, and the documents are then clustered according to the aspects. The final result is an aspect-based report on the cloud platforms.
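The stages of Algorithm 1 can be illustrated with a minimal Python sketch. It is not the Java implementation described in Section 4, and several parts are assumptions made only for illustration: the stop-word list, the crude suffix stripper used in place of the Porter stemmer, the smoothed TF/IDF formula, and the aspect keyword lists. The full K-Means iteration of step 19 is also replaced here by a single assignment pass that treats each aspect's keywords as a fixed seed vector and assigns every review to the most similar seed by cosine similarity.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the paper does not give one).
STOP_WORDS = {"the", "is", "a", "an", "and", "of", "to", "for", "it", "but", "are"}


def simple_stem(word):
    """Crude suffix stripper standing in for the Porter stemmer (step 11)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lower-case, tokenize, drop stop words, then stem (steps 9-12)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [simple_stem(t) for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Build one sparse TF/IDF vector (a dict) per tokenized document (steps 13-18)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {}
        for term, count in tf.items():
            idf = math.log((1 + n) / (1 + df[term])) + 1  # smoothed IDF (assumed variant)
            vec[term] = (count / len(doc)) * idf
        vectors.append(vec)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors (step 20)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def group_by_aspect(reviews, aspects):
    """Assign each review to its most similar aspect seed (simplified steps 19-22)."""
    vectors = tfidf_vectors([preprocess(r) for r in reviews])
    seeds = {name: {t: 1.0 for t in preprocess(" ".join(words))}
             for name, words in aspects.items()}
    groups = {name: [] for name in aspects}
    for review, vec in zip(reviews, vectors):
        best = max(seeds, key=lambda name: cosine(vec, seeds[name]))
        if cosine(vec, seeds[best]) > 0:  # skip reviews that match no aspect
            groups[best].append(review)
    return groups


# Hypothetical reviews and aspect keywords, used only to exercise the sketch.
reviews = [
    "The pricing is cheap and billing is clear",
    "Strong encryption and secure access controls",
    "Billing costs increased but pricing stays cheap",
]
aspects = {
    "cost": ["pricing", "billing", "cheap", "costs"],
    "security": ["secure", "encryption", "access"],
}
groups = group_by_aspect(reviews, aspects)
```

In this sketch the first and third reviews fall under the cost aspect and the second under security. A fuller version would update the cluster centroids iteratively, as K-Means does, rather than keeping the seeds fixed.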

4. IMPLEMENTATION DETAILS

We have implemented the proposed system as a web-based application on the Java platform, using Servlets and JSP technologies. It is an interactive application that provides an interface for two kinds of users: end user and administrator.

Figure 3: Entering the Registration Form with Details

As shown in Figure 3, the user provides details for the registration process, and the data is then submitted to the server. Registration places a request with the cloud server. The administrator must authorize users before they can use the application's functionality.

Figure 4: Admin Login Home Page

As presented in Figure 4, the home page is shown to the administrator after authentication. The administrator has privileges to perform various activities, which give control over the operations in the application.

Figure 5: Authorization of Users

As shown in Figure 5, many registered users are waiting for authorization from the administrator. Once the administrator has authorized them, users can perform their operations after authenticating.
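The register, authorize and log in sequence described above can be sketched as a small state model. The sketch below is in Python for brevity and is not the Servlet/JSP code of the application; the class and method names are hypothetical, and passwords are kept in plain text only to keep the illustration short.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"        # registered, waiting for the administrator
    AUTHORIZED = "authorized"  # approved by the administrator


class UserRegistry:
    """Hypothetical in-memory stand-in for the application's user store."""

    def __init__(self):
        self._users = {}  # username -> (password, Status)

    def register(self, username, password):
        # New registrations always start in the pending state.
        self._users[username] = (password, Status.PENDING)

    def pending(self):
        # List shown to the administrator on the authorization page.
        return [u for u, (_, s) in self._users.items() if s is Status.PENDING]

    def authorize(self, username):
        password, _ = self._users[username]
        self._users[username] = (password, Status.AUTHORIZED)

    def login(self, username, password):
        # Login succeeds only for authorized users with a matching password.
        return self._users.get(username) == (password, Status.AUTHORIZED)


registry = UserRegistry()
registry.register("alice", "s3cret")
before = registry.login("alice", "s3cret")   # not yet authorized
waiting = registry.pending()
registry.authorize("alice")
after = registry.login("alice", "s3cret")    # authorized and authenticated
```

A production system would store salted password hashes and persist the status in a database rather than in memory.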

Figure 6: View Other Comments on Search Product

As presented in Figure 6, the comments or reviews given by users can be viewed here. These reviews are later analyzed to determine how different cloud platforms perform on each aspect. The analysis is based on the content of the reviews.

5. EXPERIMENTAL RESULTS

This section presents the results of the experiments. The performance of different cloud platforms is analyzed and observations are presented.

Cloud          Efficiency   Time Complexity   Cost
Amazon Cloud   7            4                 9
Google Cloud   10           6                 1.5

Table 1: Results of experiments

As presented in Table 1, cloud efficiency, time complexity and cost are tabulated for each platform.
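As an illustration of how such a table can be read aspect by aspect, the sketch below picks the better platform for each column of Table 1. The direction of "better" is an assumption, since the paper does not state units or scales: a higher efficiency score is taken as better, and lower time complexity and cost are taken as better.

```python
# Values copied from Table 1; units are not given in the paper.
table1 = {
    "Amazon Cloud": {"efficiency": 7, "time_complexity": 4, "cost": 9},
    "Google Cloud": {"efficiency": 10, "time_complexity": 6, "cost": 1.5},
}

# Assumed direction of "better" for each aspect (not stated in the paper).
higher_is_better = {"efficiency": True, "time_complexity": False, "cost": False}


def best_platform(table, aspect):
    """Return the platform with the best value for one aspect."""
    pick = max if higher_is_better[aspect] else min
    return pick(table, key=lambda name: table[name][aspect])


best = {aspect: best_platform(table1, aspect) for aspect in higher_is_better}
```

Under these assumptions, Google Cloud comes out ahead on efficiency and cost, and Amazon Cloud on time complexity.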

Figure 7: Cloud platform efficiency details

As presented in Figure 7, the cloud platforms are analyzed and the results are presented. The horizontal axis shows the cloud platforms, while the vertical axis shows the ranking or rating derived from user reviews.

Year                2016   2017   2018   2019   2020   2021
Revenue (in $ bn)   $22    $28    $35    $40    $47    $55

Table 2: Cloud services market details

As shown in Table 2, revenue details of the cloud services market are provided for the years 2016 to 2021.

Figure 8: Cloud computing as a service revenue

As shown in Figure 8, cloud computing as a service revenues are projected up to the year 2021 and increase every year. The horizontal axis shows years, while the vertical axis shows revenue.
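The year-on-year increase can be quantified with a short calculation over the Table 2 figures. The sketch below computes each year's growth rate and the compound annual growth rate (CAGR) over the whole period; the function and variable names are illustrative.

```python
# Revenue in $ bn, copied from Table 2.
revenue_bn = {2016: 22, 2017: 28, 2018: 35, 2019: 40, 2020: 47, 2021: 55}


def cagr(first, last, periods):
    """Compound annual growth rate between two values over a number of periods."""
    return (last / first) ** (1 / periods) - 1


years = sorted(revenue_bn)
# Growth of each year relative to the previous one.
yearly = {y: revenue_bn[y] / revenue_bn[y - 1] - 1 for y in years[1:]}
# Overall growth rate from the first to the last year in the table.
growth = cagr(revenue_bn[years[0]], revenue_bn[years[-1]], years[-1] - years[0])
```

For the values in Table 2, the CAGR from 2016 to 2021 comes out at roughly 20% per year, and every yearly growth rate is positive.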

Cloud Service    Market Share
Amazon           35%
Microsoft        13%
IBM              7%
Google           5%
Next 10          18%
Rest Of Market   25%

Table 3: Market share of cloud platforms

As presented in Table 3, market share data is given for different cloud platforms, showing how each performs in terms of market share.

Figure 9: Market share of cloud platforms

As presented in Figure 9, different cloud platforms are compared by market share. Amazon's market share is the highest, while Google's is the lowest among the named providers.
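The ranking can be reproduced directly from the Table 3 figures. The sketch below sorts the four named providers by share and sums all rows as a consistency check; the variable names are illustrative.

```python
# Market share in percent, copied from Table 3.
market_share = {
    "Amazon": 35,
    "Microsoft": 13,
    "IBM": 7,
    "Google": 5,
    "Next 10": 18,
    "Rest Of Market": 25,
}

# Only the individually named providers are ranked against each other.
named = ["Amazon", "Microsoft", "IBM", "Google"]
ranked = sorted(named, key=market_share.get, reverse=True)

# Consistency check over all rows of the table.
total = sum(market_share.values())
```

The named providers rank as Amazon, Microsoft, IBM and Google. The listed percentages add up to 103 rather than 100, which may reflect rounding in the underlying market data.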

6. CONCLUSION AND FUTURE WORK

In this paper, we proposed an interactive framework. Users can register with the application and provide reviews of different cloud platforms, so the application serves as a platform for sharing opinions on cloud services and platforms. These reviews are valuable for understanding different clouds in terms of cost patterns, security, ease of use and other aspects. To analyze the reviews, we built an algorithm and evaluated it with an application. The empirical results showed that the proposed application is useful and helps in finding which cloud platform is better for a given aspect. In future, we intend to improve the framework to provide recommendations as well.

REFERENCES

[1]. Lizhe Wang, Jie Tao, Rajiv Ranjan, Holger Marten, Achim Streit, Jingying Chen and Dan Chen. (2013). G-Hadoop: MapReduce across distributed data centers for data-intensive computing. IEEE, p1-14.

[2]. Jiaqi Zhao, Lizhe Wang, Jie Tao, Jinjun Chen, Weiye Sun, Rajiv Ranjan, Joanna Kołodziej, Achim Streit and Dimitrios Georgakopoulos. (2014). A security framework in G-Hadoop for big data computing across distributed Cloud data centers. Journal of Computer and System Sciences, p1-14.

[3]. Miguel G. Xavier, Marcelo V. Neves and Cesar A. F. De Rose. (2014). A Performance Comparison of Container-Based Virtualization Systems for MapReduce Clusters. ACM, p1-9.

[4]. Avita Katal, Mohammad Wazid and R H Goudar. (2013). Big Data: Issues, Challenges, Tools and Good Practice. IEEE, p1-6.

[5]. Yanish Pradhananga, Shridevi Karande and Chandraprakash Karande. (2016). High Performance Analytics of Bigdata with Dynamic and Optimized Hadoop Cluster. ISBN, p1-7.

[6]. Alberto Fernandez, Sara del Rio, Victoria Lopez, Abdullah Bawakid, Maria J. del Jesus, Jose M. Benítez and Francisco Herrera. (2014). Big Data with Cloud Computing: an insight on the computing environment, MapReduce, and programming frameworks. ACM, p1-31.

[7]. Vinod Kumar Vavilapalli, Arun C Murthy, Chris Douglas, Sharad Agarwal, Mahadev Konar, Robert Evans, Thomas Graves, Jason Lowe, Hitesh Shah, Siddharth Seth, Bikas Saha, Carlo Curino, Owen O’Malley et al. (2013). Apache Hadoop YARN: Yet Another Resource Negotiator. ACM, p1-16.

[8]. Amresh Kumar, Kiran M., Saikat Mukherjee and Ravi Prakash G. (2013). Verification and Validation of MapReduce Program model for Parallel K-Means algorithm on Hadoop Cluster. International Journal of Computer Applications. 72, p1-8.

[9]. Katarina Grolinger, Michael Hayes, Wilson A. Higashino, Alexandra L'Heureux, David S. Allison and Miriam A.M. Capretz. (2014). Challenges for MapReduce in Big Data. IEEE, p1-10.

[10]. Karthik Kambatla, Giorgos Kollias, Vipin Kumar and Ananth Grama. (2014). Trends in big data analytics. IEEE, p1-13.

[11]. John A. Miller, Casey Bowman, Vishnu Gowda Harish and Shannon Quinn. (2016). Open Source Big Data Analytics Frameworks Written in Scala. IEEE, p1-5.

[12]. Mythreyee S, Poornima Purohit and Apoorva D.R. (2017). A Study on Use of Big Data in Cloud Computing Environment. IJARIIT, p1-7.
