Big Data Threat Detection on Cloud Environment
with Business Intelligence
[1] Dr. D. Nageswara Rao, [2] P. Hiranmanibala, [3] G S Pradeep Ghantasala
[1] Professor, [2] Assistant Professor, [3] Assistant Professor
[1,3] Galgotias University
Abstract
Cloud computing raises security issues such as information security, computer
security, data privacy and network security, which are emerging at a rapid pace. Big
data requires processing data distributed across numerous servers, and it generates
large quantities of data that become available in the cloud environment. The
progressive development of big data has also increased the threats to information
security. Business intelligence and analytics has emerged as a significant area of
study for both researchers and practitioners, reflecting the magnitude and impact of
the data-related problems to be solved in present-day business organizations. This
paper identifies threats to business on big data in the cloud environment and
analyzes the challenges and merits they bring to enterprises.
Keywords:
Big Data, Cloud Computing, Threat Detection, Business Intelligence and Analytics,
Data Privacy, Security
1. Introduction
Researchers in big data currently focus on the management and analysis of big data,
which defines the approach of zero latency in data analysis, and on the prevention of
data loss through big data security and privacy. Security and privacy are two
significant concerns with different requirements. Big data security generally refers
to using big data to implement solutions that increase the security, safety and
reliability of distributed systems, whereas big data privacy aims to protect big data
from unwanted interference and unauthorized usage.
International Journal of Pure and Applied Mathematics, Volume 119 No. 18 2018, 1789-1799, ISSN: 1314-3395 (on-line version), url: http://www.acadpubl.eu/hub/ Special Issue http://www.acadpubl.eu/hub/
Business Intelligence (BI) is the practice of collecting, integrating, analyzing and
presenting business information using a range of applications and technologies. The
major objective of BI is to support better and quicker business decision making, by
improving business operations and making sense of the information that backs those
decisions. In recent years, big data analysis and cloud computing have been the two
most significant technologies to enter mainstream business, and the two have been
brought together to deliver business benefits and powerful results. Big data is one
of the major data analysis methodologies enabled by recent developments in the IT
sector. Cloud computing has already changed how IT services are provided to
organizations and how users and businesses interact with resources. Because big data
analysis needs large amounts of computing resources, the cost of adopting a big data
methodology is not economical for many small and medium enterprises. In cloud
computing integration, security is regarded as the major risk, as acknowledged by
nearly 75% of IT experts [1]. With cloud computing, data is stored and accessed
through the internet. For some businesses, it is essential to keep data on the
premises because it is confidential. There are several solutions for securing data,
including encryption, but it is the organization’s responsibility to encrypt its data
appropriately on the cloud. Although virtualization is essential in any cloud
technology, it might cause highly technical security breaches, as the data will
remain stored on virtual hardware even when its index is deleted [2].
This paper focuses on the security challenges that arise when confidential or
sensitive data is moved from the organization to a big data warehouse such as Apache
Hadoop. It presents threat models and a security control framework to address and
check the risk arising from recognized security threats [3].
2. Literature Survey
The management and analysis of big data is the focus of much day-to-day research.
This work defines the methodology of zero-latency data analytics and addresses the
privacy and security challenges of big data. Privacy and security are two essential
issues with distinct requirements. Big data security is concerned with implementing
solutions that increase the security, consistency and safety of distributed systems,
whereas big data privacy emphasizes protecting big data from unwanted interference
and unauthorized users.
The work on big data and business intelligence [4] identifies new security and
privacy challenges for big data, based on zero-latency analysis and the notion of
full data [5]. The paper [6] presents several methods for analyzing big data
problems, handled through the Hadoop Distributed File System (HDFS) with MapReduce,
and proposes an implementation of MapReduce techniques using HDFS for big data.
Yadvav [7] proposes an algorithm and gives an overview of the architecture used for
large data sets. It describes the various tools that were established for analyzing
big data, and also analyzes the security issues, trends and applications that arise
when using large data sets.
Cloud BI is an innovative concept that uses a cloud-based architecture to deliver BI
capabilities as a service, at lower cost and with quicker deployment and greater
flexibility [8]. The delivery model is software as a service BI (SaaS BI), in which
the applications are normally deployed at a hosted location outside the company’s
firewall and accessed by an end user over a secure Internet connection [9]. The cloud
affects many present-day businesses and has a major impact on the BI industry. It
changes the way organizations analyze their data and make decisions, and could help
them turn that data into valuable business information [10].
An advanced persistent threat (APT) is a targeted attack against a high-value asset
or a physical system. In contrast to mass-spreading malware such as viruses, Trojans
and worms, it operates in low-and-slow mode: "low" means maintaining a low profile in
the network, and "slow" means allowing for a long execution time. APT attackers often
leverage stolen user credentials or zero-day exploits to avoid triggering alarms.
This kind of attack can extend over a long period of time while the victim
organization remains oblivious to the intrusion. The 2010 Verizon Data Breach
Investigations Report concludes that in 86% of cases evidence of the data breach was
recorded in the organization's logs, but the detection mechanisms failed to raise
security alarms [11].
3. Methodology
3.1 Big Data Threat Detection Framework
The Cloud Security Alliance (CSA) is a non-profit organization that promotes the use
of best practices for providing security assurance within the cloud environment. Its
big data group has focused on the most important challenges in implementing secure
big data services. The privacy and security challenges of the big data ecosystem are
categorized under four different aspects: data privacy, infrastructure security, data
management, and integrity and reactive security. For each challenge, the CSA offers
threat models and security controls to identify the threats and to address and
mitigate the risk. This paper gives a detailed description of the Hadoop system,
identifies its security weaknesses, and provides a reference security framework for
an enterprise in the cloud environment [12].
Figure 1: CSA classification of the Top 10 Challenges
3.1.1 Infrastructure Security & Integrity
In the past three years, the Common Vulnerabilities and Exposures (CVE) database
shows only four reported and fixed Hadoop vulnerabilities. This might mean either
that most vulnerability remediation happens internally within the vendor
environments, with no public reporting, or that it reflects a lack of activity in
the security community. Hadoop's security configuration files are self-contained,
with no validity checks before such policies are deployed. This usually results in
availability and data integrity issues.
Secure Computations in Distributed Programming Frameworks
Distributed programming frameworks use parallel computation and storage to process
huge amounts of data. The MapReduce framework is the best-known example: it splits an
input file into multiple portions. In the first phase, a Mapper for each portion
reads the data, executes some computation, and outputs a list of key/value pairs. In
the second phase, a Reducer combines the values belonging to each distinct key and
outputs the result. There are two main attack prevention measures: securing the
mappers, and securing the data in the presence of an untrusted mapper [12].
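The two phases described above can be illustrated with a minimal, single-process
sketch. It is not Hadoop code: the function names (mapper, shuffle, reducer,
map_reduce) and the word-count task are assumptions chosen for illustration, and a
real framework would run the mappers on separate nodes.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def mapper(chunk: str) -> Iterator[tuple[str, int]]:
    """Map phase: read one portion of the input and emit key/value pairs."""
    for word in chunk.split():
        yield word.lower(), 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group all emitted values by key before the reduce phase."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce phase: combine the values that belong to one distinct key."""
    return key, sum(values)


def map_reduce(chunks: list[str]) -> dict[str, int]:
    """Run every (hypothetical) input portion through map, shuffle and reduce."""
    pairs = (pair for chunk in chunks for pair in mapper(chunk))
    return dict(reducer(k, v) for k, v in shuffle(pairs).items())
```

Because each mapper sees only its own portion, a single untrusted mapper can corrupt
the aggregated result for any key it emits, which is why the mappers themselves need
to be secured.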
Security Best Practices for Non-Relational Data Stores
Non-relational data stores, popularized by NoSQL databases, are still evolving with
respect to security infrastructure. For example, robust solutions to NoSQL injection
are not yet mature, and security was never part of the model at any point of the
design stage. Developers using NoSQL databases typically embed security in the
middleware, because the databases do not provide support for enforcing it explicitly.
In addition, the clustering aspect of NoSQL databases poses further challenges to the
robustness of such security practices.
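As a rough illustration of security embedded in the middleware, the sketch below
rejects any filter value that is not a plain scalar before it reaches the data store.
The function names and the login fields are hypothetical, and the `$ne` operator in
the usage note assumes a document store with MongoDB-style query operators, which the
paper does not name.

```python
def sanitize_filter_value(value):
    """Accept only plain scalar values for a query filter.

    Rejecting dicts and lists blocks operator injection, where an
    attacker supplies an operator object in place of a string.
    """
    if isinstance(value, (str, int, float, bool)):
        return value
    raise ValueError(f"unsupported filter value: {type(value).__name__}")


def build_login_filter(username, password_hash):
    """Build a document-store filter from untrusted request fields."""
    return {
        "username": sanitize_filter_value(username),
        "password_hash": sanitize_filter_value(password_hash),
    }
```

With this check in place, a request that sends `{"$ne": None}` as the password hash
raises an error instead of producing a filter that matches every user.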
End-Point Input Validation/Filtering
Many big data use cases in enterprise settings require data collection from many
sources, such as endpoint devices. For instance, a Security Information and Event
Management (SIEM) system may gather event logs from a large number of hardware
devices and software applications in an enterprise network. An important challenge in
the data collection process is input validation. Validating and filtering input is a
daunting task posed by untrusted input sources, particularly with the bring your own
device (BYOD) model.
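A minimal sketch of input validation for collected event logs might look as follows.
The event fields (source_ip, timestamp, severity, message), the allowed severity
levels and the 1024-character limit are all assumptions made for illustration, not a
SIEM product's schema.

```python
import ipaddress
from datetime import datetime

ALLOWED_SEVERITIES = {"info", "warning", "critical"}  # hypothetical levels


def validate_event(event: dict) -> list[str]:
    """Return a list of validation problems; an empty list means accepted."""
    problems = []
    try:
        ipaddress.ip_address(event.get("source_ip", ""))
    except ValueError:
        problems.append("source_ip is not a valid IP address")
    try:
        datetime.fromisoformat(event.get("timestamp", ""))
    except (TypeError, ValueError):
        problems.append("timestamp is not in ISO format")
    if event.get("severity") not in ALLOWED_SEVERITIES:
        problems.append("severity is not an allowed value")
    message = event.get("message")
    if not isinstance(message, str) or len(message) > 1024:
        problems.append("message must be a string of at most 1024 characters")
    return problems
```

Format checks of this kind only filter malformed input. They do not establish that a
well-formed event from a BYOD endpoint is truthful, which is the harder part of the
challenge.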
Real-Time Security and Compliance Monitoring
Real-time security monitoring has always been a challenge, given the number of alerts
generated by security devices. These alerts lead to many false positives, which are
typically ignored or simply “clicked away,” as people cannot cope with the sheer
amount. This problem may even increase with big data, given the volume and velocity
of data streams. However, big data technologies may also offer an opportunity, in the
sense that they allow fast processing and analytics of different types of data, which
in turn can be used to provide, for example, real-time anomaly detection based on
scalable security analytics.
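One simple form of anomaly detection is sketched below. It uses a rolling z-score
over alert counts, which is a technique chosen here for illustration and not one the
paper prescribes. The function name, the window size and the threshold are
assumptions.

```python
from statistics import mean, stdev


def find_anomalies(counts: list[float], window: int = 5,
                   threshold: float = 3.0) -> list[int]:
    """Return the indexes whose value deviates from the preceding window.

    A point is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` observations.
    """
    flagged = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(counts[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged
```

For example, with per-minute alert counts of 10, 12, 11, 9, 10, 11, 95, 10, only the
spike of 95 is flagged. The threshold trades missed detections against the false
positives discussed above.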
3.1.2 Identity & Access Management
Access Control List (ACL) and Role Based Access Control (RBAC) policy files for
components such as HBase and MapReduce are typically configured through clear-text
files. These files are editable by privileged accounts on the system, such as root
and other application accounts.
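One possible mitigation, sketched here under stated assumptions, is to compare a
policy file's digest against a trusted baseline before loading it. The function names
and the policy text are hypothetical, and the sketch assumes the baseline digest is
kept somewhere the editing accounts cannot modify.

```python
import hashlib
import hmac


def policy_digest(policy_text: str) -> str:
    """Return the SHA-256 digest of a clear-text policy file's content."""
    return hashlib.sha256(policy_text.encode("utf-8")).hexdigest()


def is_policy_unchanged(policy_text: str, trusted_digest: str) -> bool:
    """Compare the current policy against a digest recorded at approval time."""
    return hmac.compare_digest(policy_digest(policy_text), trusted_digest)
```

A check like this detects an edit after the fact. It does not prevent a privileged
account from making one.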
Secure Data Storage and Transaction Logs
Data and transaction logs are stored in multi-tiered storage media. Manually moving
data between tiers gives the IT manager direct control over exactly what data is
moved and when. However, as the size of data sets has been, and continues to be,
growing exponentially, scalability and availability have necessitated auto-tiering
for big data storage management. Auto-tiering solutions do not keep track of where
the data is stored, which poses new challenges to secure data storage. New mechanisms
are imperative to prevent unauthorized access and maintain 24/7 availability.
Granular Audits
Real-time security monitoring tries to notify us at the moment an attack takes place,
but in reality this will not always be the case. Audit information is needed to get
to the bottom of a missed attack: it helps to understand what happened and to see
what went wrong, and it is also needed for compliance, regulation and forensic
reasons. In that regard auditing is not something new, but its scope and granularity
might be different, for example when dealing with many data objects that are probably
distributed.
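Audit records are only useful for forensics if they can be trusted. The sketch below
shows one common way to make a per-object audit trail tamper-evident by chaining
record hashes. It is an illustrative assumption and not a mechanism described in the
paper, and the record fields and function names are hypothetical.

```python
import hashlib
import json


def append_entry(log: list[dict], actor: str, action: str, obj: str) -> None:
    """Append an audit record linked to the previous record by its hash."""
    prev_hash = log[-1]["hash"] if log else "0" * 64
    record = {"actor": actor, "action": action, "object": obj, "prev": prev_hash}
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["hash"] = hashlib.sha256(payload).hexdigest()
    log.append(record)


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; an edited record, or one removed from the
    middle of the log, breaks the chain."""
    prev_hash = "0" * 64
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        expected = hashlib.sha256(payload).hexdigest()
        if record["prev"] != prev_hash or record["hash"] != expected:
            return False
        prev_hash = record["hash"]
    return True
```

Note that truncating the end of the log is not detected by this check alone. The
latest hash would have to be stored separately for that.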
Data Provenance
Provenance metadata will grow in complexity as provenance-enabled programming
environments are used in big data applications, producing large provenance graphs.
Analyzing such large provenance graphs to detect metadata dependencies for security
and privacy applications is computationally intensive.
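To make the idea of dependency detection concrete, the following sketch walks a small
provenance graph to find every upstream source of a data item. The graph
representation (a dict mapping each item to the items it was derived from) and the
function name are assumptions for illustration.

```python
from collections import deque


def upstream_sources(graph: dict[str, list[str]], item: str) -> set[str]:
    """Return every ancestor of `item` in a provenance graph.

    `graph` maps each data item to the items it was derived from.
    """
    seen: set[str] = set()
    queue = deque(graph.get(item, []))
    while queue:
        parent = queue.popleft()
        if parent not in seen:
            seen.add(parent)
            queue.extend(graph.get(parent, []))
    return seen
```

The traversal visits each node and edge at most once, so a single query is cheap. The
cost arises when such queries must run over graphs with very many nodes, as the
paragraph above notes.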
3.1.3 Data Privacy & Security
All the issues associated with SQL injection attacks also carry over to Hadoop
components such as Impala and Hive. SQL prepared statements, which would have allowed
the separation of query and data, are currently not available. There is also a lack
of native cryptographic controls for protecting sensitive data; such security is
often provided outside the data or application stack. Data is sent in plain text when
it is transferred from one node to another. In addition, data locality cannot be
strictly enforced, and the scheduler may not be able to find resources next to the
data, which forces it to read the data over the network.
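The separation of query and data that prepared statements provide can be shown with a
small example. It uses Python's built-in sqlite3 module purely as a stand-in, not the
Hive or Impala interfaces, and the table and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "analyst")],
)

untrusted = "nobody' OR '1'='1"

# Unsafe: the input is spliced into the query text and changes its logic.
unsafe_rows = conn.execute(
    f"SELECT name FROM users WHERE name = '{untrusted}'"
).fetchall()

# Safer: the placeholder keeps the input as data, separate from the query.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (untrusted,)
).fetchall()
```

The spliced query returns every row in the table, while the parameterized query
returns none, because no user has that literal name.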
Scalable and Privacy-Preserving Data Mining and Analytics
This challenge concerns securing analytics and data mining. Big data can be seen as a
troubling sign of authoritarianism, potentially enabling invasions of privacy,
invasive marketing, increased state and corporate control, and reduced civil
freedoms. Organizations are leveraging recent data analytics technologies for
marketing purposes, and anonymizing data for analytics is not enough to preserve user
privacy.
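One way to see why removing names is not sufficient is to measure how many records
share the same combination of remaining attributes. The sketch below computes that
group size, the "k" of k-anonymity, which is a concept introduced here for
illustration and not taken from the paper. The field names and records are
hypothetical.

```python
from collections import Counter


def smallest_group(records: list[dict], quasi_identifiers: list[str]) -> int:
    """Return k, the size of the smallest group of records that share the
    same quasi-identifier values. k == 1 means at least one record is unique.

    Assumes `records` is non-empty.
    """
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values())
```

If the result is 1 for a set of attributes such as postal code and age, a record
with no name attached can still be linked to one individual by anyone who knows those
attributes.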
Cryptographically Enforced Access Control and Secure Communication
One of the main measures is to ensure that confidential and sensitive personal
information is secure end-to-end and accessible only to authorized entities, which
requires the data to be encrypted based on access control policies. Particular
research in this area, such as Attribute Based Encryption (ABE), has to be made
better, more scalable and more efficient. A cryptographically secure communication
framework has to be implemented to ensure authentication and agreement among the
distributed entities.
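The authentication part of such a framework can be sketched with a keyed message
authentication code. This is not ABE and provides no encryption. It only shows how
two nodes that share a secret key (an assumption of this sketch) can check that a
message came from a key holder and was not altered. The function names are
hypothetical.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for a message under a shared key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Check a received tag in constant time against a recomputed one."""
    return hmac.compare_digest(sign(key, message), tag)
```

A receiving node would call verify on each incoming message and discard any message
whose tag does not match.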
Granular Access Control
In access control, the security property that matters is secrecy: preventing access
to data by people who should not have access. The problem with coarse-grained access
mechanisms is that data which could otherwise be shared is often swept into a more
restrictive category to guarantee sound security. Granular access control gives data
managers a knife instead of a sword, so that they can share data as much as possible
without compromising secrecy.
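A minimal sketch of field-level access control is given below. The label names, the
record fields and the function name are hypothetical, and unlabeled fields are
withheld by default as a design assumption of this sketch.

```python
def visible_fields(record: dict, labels: dict, clearances: set) -> dict:
    """Return only the fields whose label the caller is cleared for.

    `labels` maps a field name to its required label; fields without
    a label are withheld.
    """
    return {
        field: value
        for field, value in record.items()
        if labels.get(field) in clearances
    }
```

With labels on individual fields, a record holding one restricted value can still be
shared in part, instead of the whole record being moved to the most restrictive
category.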
4. Conclusion
Businesses have long used data analytics to help direct strategy, improve profit and
support the decision-making process. Today, big data methodology and cloud computing
technology are widely used in organizations to shape the business. Big data clearly
provides interesting opportunities for both users and businesses, but these
opportunities are countered by huge security and privacy challenges. Traditional
security mechanisms fall short of providing a complete solution to those challenges.
The Cloud Security Alliance framework presented here addresses ten data security and
privacy issues, with the aim of making big data processing and BI computing more
secure.
References
[1] Marinela MIRCEA, Bogdan GHILIC-MICU and Marian STOICA, 2011,
“COMBINING BUSINESS INTELLIGENCE WITH CLOUD COMPUTING
TO DELIVERY AGILITY IN ACTUAL ECONOMY”, Economic
Computation & Economic Cybernetics Studies & Research, Vol. 45 Issue 1,
p1.
[2] Christina Tamer, Mary Kiley, Noushin Ashrafi & Jean-Pierre Kuilboer,
“RISKS AND BENEFITS OF BUSINESS INTELLIGENCE IN THE
CLOUD”, University of Massachusetts Boston, Management Science and
Information Systems Department, 100 Morrissey Blvd, Boston, MA, 02125
United States of America.
[3] Ajit Gaddam, “Securing Your Big Data Environment”.
[4] E. Damiani et al.: Business intelligence meets big data: a manifesto. In: Proc.
of the 3rd International Symposium on Data-Driven Process Discovery and
Analysis (post proceedings). Riva del Garda, Italy (August 2013).
[5] Claudio A. Ardagna and Ernesto Damiani, “Business Intelligence meets Big
Data: An Overview on Security and Privacy”.
[6] Agarwal, D., Das, S. and Abbadi, A. (2011). Big Data and Cloud Computing:
Current State and Future Opportunities. ACM 978-1-4503-0528-0/11/0003.
[7] Talia, D. (2013). Clouds for Scalable Big Data Analytics. Published by IEEE
Computer Society.
[8] Yuvraj Singh Gurjar & Vijay Singh Rathore, 2013, “Cloud Business
Intelligence – Is What Business Need Today”, International Journal of Recent
Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-1, Issue-6.
[9] “Software as a Service BI (SaaS BI)”,
http://searchbusinessanalytics.techtarget.com/definition/Software-as-a-Service-BI-SaaS-BI,
Requested in February 2014.
[10] “Business Intelligence in the Cloud”,
http://cloudcomputingtopics.com/2011/09/business-intelligence-in-the-cloud/,
Requested in December 2013.
[11] Big Data Analytics for Security Intelligence
[12] Top Ten Big Data Security and Privacy Challenges