Review Based Analysis on Performance of
Cloud Platforms
1 G. Kavitha, 2 Shobini B. and 3 G. Shiva Krishna
1, 2, 3 Computer Science and Engineering, Swathi Institute of Technology & Sciences, Near Ramoji Film City,
Beside Kothagudem 'X' Roads, Hyderabad, Telangana 501512
gidugupatikavitha1@gmail.com, shobini.b@gmail.com, Shivagujju@gmail.com
Abstract
Cloud has emerged as a platform for outsourcing data and computations. A Cloud Service Provider (CSP)
maintains the cloud and offers services to cloud consumers, governed by agreements between provider and
consumer for the different services. Several cloud platforms are available in the real world, each with
specific services and deployment models that are widely used. As cloud services are affordable,
outsourcing data and computing has become common for organizations. In this context, aspects such as
cost, security and ease of use can help users make well informed decisions. To analyze the performance
of cloud platforms, we developed a framework in this paper. The framework lets users and administrators
interact with the system, and users can provide reviews on various cloud platforms. We proposed an
algorithm that analyzes these reviews to extract useful information. The algorithm is evaluated with a
prototype and the results showed the usefulness of the proposed system.
Keywords – Cloud computing, cloud performance, user reviews, clustering, business intelligence
Science, Technology and Development
Volume VIII Issue X OCTOBER 2019
ISSN : 0950-0707
Page No : 29
1. INTRODUCTION
Cloud computing services have become the paradigm of large scale infrastructure where a third
party provides shared virtual computing and storage resources to a client. The use of third party
infrastructure translates into cost reductions for the client, who does not invest in infrastructure and
maintenance. However, the negotiation of Service Level Agreements (SLAs) for Infrastructure-as-a-Service
(IaaS) in the cloud remains a challenging problem. One of the main difficulties of guaranteeing
performance in the cloud is the inherent randomness produced by the massive number of time-varying
interactions and events taking place in the system. Nevertheless, it is possible to optimize different
performance measures on average based on the number and size of the available virtual resources.
Figure 1: Benefits of cloud computing
As shown in Figure 1, there are many benefits of cloud computing that make it an attractive
solution to enterprises in the real world. It increases productivity as it provides pre-configured
environments, software, hardware, servers and a host of other computing resources. It provides a low cost
solution to computing and storage as it is based on virtualization, which makes cloud computing affordable.
Flexibility is another advantage, as it enables elasticity in providing computing resources based on the
demand of users. It helps in collaboration with other partners in research and other activities, and such
business collaborations make the growth of organizations easier. Maintaining and updating software
becomes easier as it avoids the traditional, cumbersome work of client side installation and maintenance.
Cloud computing enables mobility as users can access it from anywhere in the world. It also helps in
getting services from different vendors. Cloud computing, as claimed by providers, has mechanisms to
prevent data loss and provide secure means of accessing data. Nevertheless, cloud servers are considered
untrusted from the user's point of view.
Although many approaches have been proposed to overcome this problem, commercial clouds have not
been able to implement efficient systems where users pay for specific performance measures such as CPU
and memory utilization rather than a flat hourly rate. In commercial IaaS, such as Amazon Elastic
Compute Cloud (EC2) provided by Amazon Web Services (AWS), the set of possible
configurations only allows for coarse-grained variations of the control inputs of the system. This makes
it challenging to apply performance regulation using existing techniques based on model identification and
control systems theory.
With online reviews of cloud platforms now widely available, it is essential to have mechanisms that
process reviews and help users make well informed decisions. Towards this end, in this paper, we propose a
framework for analyzing reviews of cloud computing platforms given by users. These reviews are
processed in order to find aspect based information related to aspects such as price and security.
The contributions of this paper are as follows.
1. We proposed a framework for mining online reviews on performance of different cloud
platforms.
2. We proposed an algorithm known as Cloud Platforms Review Data Analysis (CP-RDA) to gain
intelligence on the performance of cloud platforms based on user reviews.
3. An application is built to evaluate the proposed algorithm and provide aspect based analysis of
cloud performance.
The remainder of the paper is structured as follows. Section 2 presents review of literature. Section 3
presents the proposed framework and underlying algorithm. Section 4 provides implementation details.
Section 5 presents experimental results. Section 6 concludes the paper with guidelines for future work.
2. RELATED WORK
This section reviews literature on the cloud and big data eco-system in distributed environments.
Wang et al. [1] studied the utility of the MapReduce programming paradigm across different data centers
in a distributed environment. They used two big data examples for processing: data associated with the
Large Hadron Collider (LHC) and High Energy Physics (HEP). They extended Hadoop and named the result
G-Hadoop, which exploits multiple data centers. Zhao et al. [2] extended G-Hadoop with cryptography and
SSL for secure big data processing across different data centers. Their work also simplified job
scheduling and authentication procedures.
Xavier et al. [3] investigated different virtualization systems used for MapReduce clusters.
Virtualization is the technology used to make better use of computing resources. They focused on
container based virtualization as it prevails in real world MapReduce frameworks, and found high
performance with Linux Containers (LXC). Katal et al. [4], on the other hand, studied the need for big
data processing in the MapReduce programming paradigm. They also specified issues and challenges related
to big data processing and described existing big data projects related to Big Science, Government,
Private Sector, and international development. With respect to data analytics, they found challenges such
as volume, storage, analysis, significance, best practices, technical challenges and skills needed.
Karande et al. [5] studied the advantages of Hadoop cluster optimizations for big data analytics with high
performance. They used Amazon S3 for storage and Elastic MapReduce (EMR) for processing big data
using the MapReduce programming paradigm.
Fernandez et al. [6] explored the term big data, cloud computing, distributed programming frameworks
and the usage of MapReduce. They proposed a big data framework that can help in working with big data
in terms of data mining and extracting business intelligence. Vavilapalli et al. [7] studied the resource
negotiator named YARN, which is associated with Apache Hadoop. It decouples the programming model from
resource management for more effective processing of big data. In fact, YARN is Hadoop's compute
platform, which delegates many scheduling functions in order to bring about effectiveness in the
programming model. Kumar et al. [8] focused on executing the K-Means algorithm on a Hadoop cluster
in order to verify and validate MapReduce functionality in a distributed environment. In other words,
the traditional K-Means algorithm is executed in a parallel processing environment with MapReduce
programming.
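To make the idea of running K-Means under MapReduce concrete, the following is a minimal single-machine sketch in plain Python, not Hadoop code and not the implementation from [8]. It assumes that the map step assigns each point to its nearest centroid and that the reduce step averages the points assigned to each centroid. The function names (`map_phase`, `reduce_phase`, `kmeans_iteration`) and the sample data are hypothetical.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def map_phase(points, centroids):
    """Map: emit (centroid index, point) for every input point."""
    for point in points:
        yield nearest(point, centroids), point


def reduce_phase(pairs, centroids):
    """Reduce: average the points emitted for each centroid index."""
    grouped = defaultdict(list)
    for k, point in pairs:
        grouped[k].append(point)
    new_centroids = list(centroids)  # a centroid with no points is left unchanged
    for k, members in grouped.items():
        new_centroids[k] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return new_centroids


def kmeans_iteration(points, centroids):
    """One K-Means iteration expressed as a map step followed by a reduce step."""
    return reduce_phase(map_phase(points, centroids), centroids)


# Hypothetical sample data: two obvious groups of 2-D points.
points = [(1.0, 1.0), (1.0, 3.0), (9.0, 9.0), (11.0, 9.0)]
centroids = kmeans_iteration(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)  # [(1.0, 2.0), (10.0, 9.0)]
```

In a real cluster the map calls would run in parallel over data partitions and the framework would group the emitted pairs by key before the reduce step; here that grouping is done by the `defaultdict`.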
Grolinger et al. [9] studied various challenges that arise when big data is processed in MapReduce
programming frameworks. The main challenges they identified are data storage, analytics, online
processing, privacy and security issues. Kambatla et al. [10] examined various trends in big data
analytics in terms of hardware platforms for data analytics, virtualization
technologies, software stacks for analytics applications, and application scope in emerging
applications. Miller et al. [11] investigated open source frameworks for big data analytics. They found
frameworks such as Apache Hadoop, YARN, GPS, Pregel, and Apache Spark. Scala-based
frameworks found by them include Spark, Kafka, Samza, and ScalaTion. Mythreyee et al. [12] studied
the relationship between cloud and big data processing.
3. PROPOSED SOLUTION
The proposed framework supports two kinds of users, the end user and the administrator, each with a
different set of operations. The architecture diagram shows the relationships between the components of
the system and is important for understanding its overall concept: the principal parts or functions are
represented by blocks connected by lines that show the relationships of the blocks. End users use
different cloud platforms and give reviews online.
Figure 2: Overview of the system architecture
As presented in Figure 2, there are many cloud service providers who can render services
to users. When users operate on the cloud and provide reviews giving their opinion of the services, the
data can be mined to obtain valuable information.
3.1 Cloud Platforms Review Data Analysis Algorithm
The proposed algorithm is used for analyzing user reviews on the performance of different platforms.
Algorithm: Cloud Platforms Review Data Analysis (CP-RDA)
Inputs : Review Dataset D, aspects A
Output : Aspect based clusters G
Making Document Corpus
1. Start
2. Initialize document corpus D’
3. Initialize matrix of vectors V
4. Initialize aspect based vector G
5. For each instance d in D
6.     Extract review into a document d’
7.     Add d’ to D’
8. End For
Pre-processing
9. For each document d’ in D’
10.     Remove stop words from d’
11.     Perform stemming on d’ using Porter Stemming algorithm
12. End For
Generating TF/IDF Matrix of Vectors
13. For each document d’ in D’
14.     For each word w in d’
15.         Compute the TF/IDF weight of w and add it to vector v
16.     End For
17.     Add v to V
18. End For
Finding Aspect Based Analysis
19. Use K-Means for making clusters based on A
20. Use cosine similarity to group vectors in V to form document clusters
21. Group associated clusters to G
22. Return G
23. Stop
Algorithm 1: Cloud Platforms Review Data Analysis Algorithm
As presented in Algorithm 1, the algorithm takes the review dataset as input along with the aspects.
The data is first subjected to pre-processing and converted into TF/IDF vectors. Based on the aspects,
the algorithm then makes clusters of documents. The final results provide an aspect based report of the
cloud platforms.
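The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' Java implementation. The stop-word list and suffix rules are hypothetical stand-ins for a full stop-word lexicon and the Porter stemmer. The TF/IDF weighting uses a smoothed logarithmic IDF. Since the paper does not spell out how K-Means uses the aspects, each cluster centroid is assumed to be seeded from the keywords of one aspect. The name `cp_rda`, the sample reviews and the aspect keywords are also illustrative.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list and suffix rules (not the Porter stemmer).
STOP_WORDS = {"the", "is", "a", "an", "and", "of", "to", "it", "for",
              "on", "but", "are", "was", "in", "very"}
SUFFIXES = ("ing", "ed", "es", "s")


def preprocess(review):
    """Steps 9-12: tokenize, remove stop words, apply a crude stemmer."""
    stems = []
    for tok in re.findall(r"[a-z]+", review.lower()):
        if tok in STOP_WORDS:
            continue
        for suffix in SUFFIXES:
            if tok.endswith(suffix) and len(tok) - len(suffix) >= 3:
                tok = tok[: -len(suffix)]
                break
        stems.append(tok)
    return stems


def tfidf_vectors(docs):
    """Steps 13-18: one sparse TF/IDF vector (a dict) per document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
         for t, c in Counter(doc).items()}
        for doc in docs
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def cp_rda(reviews, aspects, iterations=5):
    """Steps 19-22: K-Means with cosine similarity, one cluster per aspect."""
    vectors = tfidf_vectors([preprocess(r) for r in reviews])
    # Assumption: centroids are seeded from the stemmed aspect keywords.
    centroids = {a: dict.fromkeys(preprocess(" ".join(words)), 1.0)
                 for a, words in aspects.items()}
    groups = {}
    for _ in range(iterations):
        groups = {a: [] for a in aspects}
        for i, vec in enumerate(vectors):
            best = max(centroids, key=lambda a: cosine(vec, centroids[a]))
            groups[best].append(i)
        for a, members in groups.items():
            if members:  # an empty cluster keeps its previous centroid
                total = Counter()
                for i in members:
                    total.update(vectors[i])
                centroids[a] = {t: w / len(members) for t, w in total.items()}
    return groups  # aspect -> indices of the reviews grouped under it


# Illustrative input, not taken from the paper's dataset.
reviews = [
    "The pricing is cheap and the monthly cost is low",
    "Encryption and access security are strong",
    "Billing cost was higher than the price quoted",
    "Secure login but weak encryption settings",
]
aspects = {"price": ["price", "cost", "pricing", "billing"],
           "security": ["security", "secure", "encryption"]}
clusters = cp_rda(reviews, aspects)
print(clusters)  # {'price': [0, 2], 'security': [1, 3]}
```

Seeding the centroids from aspect keywords keeps each cluster tied to a named aspect, which is one way to obtain the aspect based report described above. A review that shares no term with any centroid falls into the first aspect, so a production version would need an explicit "unassigned" group.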
4. IMPLEMENTATION DETAILS
We have implemented the proposed system as a web based application using the Java platform. It makes use
of Servlets and JSP technologies in order to realize the application. It is an interactive application that
provides an interface for two kinds of users: user and admin.
Figure 3: Entering the Registration Form with Details
As shown in Figure 3, the user provides details for the registration process, and the
data is then submitted to the server. The registration process allows users to send a request to the cloud
server. The administrator needs to authorize the users before they can use the application's functionality.
Figure 4: Admin Login Home Page
As presented in Figure 4, the administrator reaches a home page after authentication.
The admin user has privileges to perform various activities, which give the
admin control over the operations in the application.
Figure 5: Authorization of Users
As shown in Figure 5, there are many registered users who are waiting for authorization from the
administrator. Once the administrator has authorized a user, that user can log in and perform
operations in the application.
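The registration and authorization flow described above can be sketched as a small state model. The paper's application is built with Java Servlets and JSP and its code is not given, so the Python class below is only an illustration: the names `UserRegistry`, `Status`, `register`, `authorize` and `login` are hypothetical, and passwords are stored in plain text purely to keep the sketch short.

```python
from enum import Enum


class Status(Enum):
    PENDING = "pending"        # registered, waiting for the administrator
    AUTHORIZED = "authorized"  # approved by the administrator


class UserRegistry:
    """In-memory stand-in for the application's user table."""

    def __init__(self):
        self._users = {}

    def register(self, username, password):
        """A new registration always starts in the PENDING state."""
        if username in self._users:
            raise ValueError("username already registered")
        # Assumption: a real system would store a salted hash, not the password.
        self._users[username] = {"password": password, "status": Status.PENDING}

    def pending(self):
        """Users the administrator still has to authorize."""
        return sorted(u for u, rec in self._users.items()
                      if rec["status"] is Status.PENDING)

    def authorize(self, username):
        """Administrator action: approve a registered user."""
        self._users[username]["status"] = Status.AUTHORIZED

    def login(self, username, password):
        """Login succeeds only for authorized users with the right password."""
        rec = self._users.get(username)
        return (rec is not None
                and rec["password"] == password
                and rec["status"] is Status.AUTHORIZED)


registry = UserRegistry()
registry.register("alice", "secret")
before = registry.login("alice", "secret")  # False: not yet authorized
registry.authorize("alice")
after = registry.login("alice", "secret")   # True: authorized and authenticated
```

Keeping authorization as a separate state from authentication mirrors the two-step process in the text: a correct password alone is not enough until the administrator has approved the account.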
Figure 6: View Other Comments on Search Product
As presented in Figure 6, the comments or reviews given by users can be viewed here. These reviews are
later analyzed to find the aspects of different cloud platforms and to assess their performance. The
analysis is based on the content of the reviews given by users.
5. EXPERIMENTAL RESULTS
This section provides results of experiments. The performance of different cloud platforms is analyzed
and observations are presented.
Cloud            Efficiency    Time Complexity    Cost
Amazon Cloud     7             4                  9
Google Cloud     10            6                  1.5
Table 1: Results of experiments
As presented in Table 1, the cloud efficiency, time complexity and cost observed for each platform are
tabulated.
Figure 7: Cloud platform efficiency details
As presented in Figure 7, the cloud platforms are analyzed and the results are presented. The horizontal
axis shows the cloud platforms while the vertical axis shows ranking or rating given according to user
reviews.
Year               2016    2017    2018    2019    2020    2021
Revenue in $ bn    $22     $28     $35     $40     $47     $55
Table 2: Cloud services market details
As shown in Table 2, the revenue details of cloud services market are provided for different years from
2016 to 2021.
Figure 8: Cloud computing as a service revenue
As shown in Figure 8, the cloud computing as a service revenues are projected up to year 2021. This
shows increase in the revenues in each year. Horizontal axis shows years while vertical axis shows
revenues.
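The year-on-year increase can be made explicit with a short calculation over the Table 2 figures. The sketch below computes the growth rate for each year and the compound annual growth rate (CAGR) from 2016 to 2021. The variable names are illustrative, and the only inputs are the revenue values from Table 2.

```python
# Revenue in $ bn, copied from Table 2.
revenue = {2016: 22, 2017: 28, 2018: 35, 2019: 40, 2020: 47, 2021: 55}

years = sorted(revenue)

# Year-on-year growth: revenue(y) / revenue(y - 1) - 1.
yoy = {y: revenue[y] / revenue[p] - 1 for p, y in zip(years, years[1:])}

# Compound annual growth rate over the whole period.
periods = len(years) - 1
cagr = (revenue[years[-1]] / revenue[years[0]]) ** (1 / periods) - 1

for y, g in yoy.items():
    print(f"{y}: {g:.1%}")
print(f"CAGR {years[0]}-{years[-1]}: {cagr:.1%}")
```

Every yearly growth rate is positive, which matches the statement that revenue increases each year, and the compound rate over the five-year span works out to roughly 20% per year.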
Cloud Service      Market Share
Amazon             35%
Microsoft          13%
IBM                7%
Google             5%
Next 10            18%
Rest Of Market     25%
Table 3: Market share of cloud platforms
As presented in Table 3, there is data related to the market share of cloud platforms. It shows different
cloud platforms and how they are performing in terms of market share.
Figure 9: Market share of cloud platforms
As presented in Figure 9, different cloud platforms are considered for analyzing their market share.
Amazon has the highest market share (35%), while Google has the lowest (5%) among the named providers.
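The ranking stated above can be reproduced directly from the Table 3 values. The following sketch sorts the four named providers by share and also totals all rows. The variable names are illustrative. The listed shares add up to 103 rather than 100, so they are treated here as approximate figures.

```python
# Market share in percent, copied from Table 3.
market_share = {
    "Amazon": 35,
    "Microsoft": 13,
    "IBM": 7,
    "Google": 5,
    "Next 10": 18,
    "Rest Of Market": 25,
}

# Only the individually named providers are ranked against each other.
named = {k: v for k, v in market_share.items()
         if k not in ("Next 10", "Rest Of Market")}
ranked = sorted(named, key=named.get, reverse=True)

total = sum(market_share.values())

print(ranked)  # ['Amazon', 'Microsoft', 'IBM', 'Google']
print(total)   # 103
```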
6. CONCLUSION AND FUTURE WORK
In this paper, we proposed a framework that is interactive in nature. Users can register with the
application and provide reviews on different cloud platforms. Thus the proposed application provides
a platform for users to share their opinions on different cloud services and platforms. These reviews
are valuable for understanding different clouds in terms of their cost patterns, security, ease of use
and other aspects. In order to analyze the reviews, we built an algorithm which is evaluated with an
application. The empirical results revealed that the proposed application is useful and helps in finding
which cloud platform is better for a given aspect. In future, we intend to improve the framework to
provide recommendations as well.
REFERENCES
[1]. Lizhe Wang, Jie Tao, Rajiv Ranjan, Holger Marten, Achim Streit, Jingying Chen and Dan
Chen. (2013). G-Hadoop: MapReduce across distributed data centers for data-intensive
computing. IEEE, p1-14.
[2]. Jiaqi Zhao, Lizhe Wang, Jie Tao, Jinjun Chen, Weiye Sun, Rajiv Ranjan, Joanna Kołodziej,
Achim Streit and Dimitrios Georgakopoulos. (2014). A security framework in G-Hadoop for
big data computing across distributed Cloud data centers. Journal of Computer and System
Sciences, p1-14.
[3]. Miguel G. Xavier, Marcelo V. Neves and Cesar A. F. De Rose. (2014). A Performance Comparison
of Container-Based Virtualization Systems for MapReduce Clusters. ACM, p1-9.
[4]. Avita Katal, Mohammad Wazid and R H Goudar. (2014). Big Data: Issues, Challenges, Tools and
Good Practice. IEEE, p1-6.
[5]. Yanish Pradhananga, Shridevi Karande and Chandraprakash Karande. (2016). High Performance
Analytics of Bigdata with Dynamic and Optimized Hadoop Cluster. ISBN, p1-7.
[6]. Alberto Fernandez, Sara del Rio, Victoria Lopez, Abdullah Bawakid, Maria J. del Jesus, Jose M.
Benítez and Francisco Herrera. (2014). Big Data with Cloud Computing: an insight on the
computing environment, MapReduce, and programming frameworks. ACM, p1-31.
[7]. Vinod Kumar Vavilapalli, Arun C Murthy, Chris Douglas, Sharad Agarwal, Mahadev Konar,
Robert Evans, Thomas Graves, Jason Lowe, Hitesh Shah, Siddharth Seth, Bikas Saha,
Carlo Curino and Owen O’Malley. (2013). Apache Hadoop YARN: Yet Another
Resource Negotiator. ACM, p1-p16.
[8]. Amresh Kumar, Kiran M., Saikat Mukherjee and Ravi Prakash G. (2013). Verification and Validation
of MapReduce Program model for Parallel K-Means algorithm on Hadoop Cluster. International
Journal of Computer Applications. 72, p1-p8.
[9]. Katarina Grolinger, Michael Hayes, Wilson A. Higashino, Alexandra L'Heureux, David S. Allison
and Miriam A.M. Capretz. (2014). Challenges for MapReduce in Big Data. IEEE, p1-p10.
[10]. Karthik Kambatla, Giorgos Kollias, Vipin Kumar and Ananth Grama. (2014). Trends in big
data analytics. IEEE, p1-13.
[11]. John A. Miller, Casey Bowman, Vishnu Gowda Harish and Shannon Quinn. (2016). Open Source
Big Data Analytics Frameworks Written in Scala. IEEE. 1-5.
[12]. Mythreyee S, Poornima Purohit and Apoorva D.R. (2017). A Study on Use of Big Data in Cloud
Computing Environment. IJARIIT. p1-7.