big data


  • ISSN: 2395-0560 International Research Journal of Innovative Engineering

    www.irjie.com Volume1, Issue 2 of February 2015

    __________________________________________________________________________________________2015 ,IRJIE-All Rights Reserved Page -58

    BIG DATA

    SWASTIKAA MOUDGIL, JAGREET KAUR

    Computer Science and Engineering Department, Chandigarh College Of Engineering and Technology Sector26, Chandigarh, 160019, India, Panjab University

    Computer Science and Engineering Department, Chandigarh College Of Engineering and Technology Sector26, Chandigarh, 160019, India, Panjab University

    Abstract -- Big Data is an all-encompassing term for any collection of datasets so large and complex that it becomes difficult to process them using traditional data processing applications. It is commonly defined by the 3V model (Volume, Velocity, Variety). Big Data has found applications in science, government, climate, the private sector and elsewhere. Companies currently working on Big Data include Microsoft, through Microsoft Research (MSR), and IBM and Apple, which are collaborating on the IBM MobileFirst for iOS applications, among many others. It has increased the demand for information specialists at companies such as Oracle, Dell and IBM. Hence, the management of exponentially increasing data and the intelligent use of this heterogeneous data are becoming a prime concern for the entire industrial sector.

    Keywords: 3V Model of Big Data, Herrenhausen Conference, Introduction to Hadoop, MobileFirst app

    1. Introduction

    Big data can also be defined as "large-volume unstructured data which cannot be handled by standard database management systems like DBMS, RDBMS or ORDBMS". Big data usually includes data sets with sizes beyond the ability of commonly used software tools to capture, curate, manage, and process within a tolerable elapsed time. The term also covers the set of techniques and technologies that require new forms of integration to uncover large hidden values from datasets that are diverse, complex, and of a massive scale. Big data is an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process them using traditional data processing applications. The challenges include analysis, capture, search, sharing, storage and transfer.

    Scientists regularly encounter limitations due to large data sets in many areas, including meteorology, genomics, complex physics simulations, and e-Science. The limitations also affect Internet search, finance and business informatics. Big data is difficult to work with using most relational database management systems and desktop statistics and visualization packages, requiring instead "massively parallel software running on tens, hundreds, or even thousands of servers".

    2. Characteristics

    In a 2001 research report and related lectures, META Group (now Gartner) analyst Doug Laney defined data growth challenges and opportunities as being three-dimensional, i.e. increasing volume, velocity, and variety. Gartner, and now much of the industry, continue to use this "3Vs" model for describing big data. In 2012, Gartner updated its definition as follows: "Big data is high volume, high velocity, and high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization." Additionally, some organizations add a new V, "Veracity", to describe it.

    Big data can be described by the following characteristics:

    Volume - The quantity of data generated. The size of the data determines its value and potential, and whether it can be considered Big Data at all.

    Variety - The category to which the data belongs, which data analysts need to know. Knowing it helps the people who work closely with the data to use it effectively to their advantage, and so upholds the importance of Big Data.

    Velocity - The speed at which data is generated and processed to meet the demands and challenges that lie ahead in the path of growth and development.

    Variability - The inconsistency that the data can show at times, which can be a problem for those who analyze it because it hampers the process of handling and managing the data effectively.

    Big Data Analytics consists of 6Cs in the integrated Industry 4.0 and Cyber Physical Systems environment. The 6C system consists of Connection (sensors and networks), Cloud (computing and data on demand), Cyber (model and memory), Content/context (meaning and correlation), Community (sharing and collaboration), and Customization (personalization and value).

    3. Applications

    3.1 Science and Research

    The Large Hadron Collider experiments represent about 150 million sensors delivering data 40 million times per second. There are nearly 600 million collisions per second. As a result, working with less than 0.001% of the sensor stream data, the data flow from all four LHC experiments represents an annual rate of 25 petabytes before replication. This becomes nearly 200 petabytes after replication. If all sensor data were to be recorded at the LHC, the data flow would be extremely hard to work with: it would exceed an annual rate of 150 million petabytes, or nearly 500 exabytes per day, before replication.

    The Square Kilometer Array is a telescope which consists of millions of antennas and is expected to be operational by 2024. Collectively, these antennas are expected to gather 14 exabytes and store one petabyte per day. It is considered to be one of the most ambitious scientific projects ever undertaken.

    When the Sloan Digital Sky Survey (SDSS) began collecting astronomical data in 2000, it amassed more in its first few weeks than all data collected in the history of astronomy. Continuing at a rate of about 200 GB per night, SDSS has amassed more than 140 terabytes of information.

    The NASA Center for Climate Simulation (NCCS) stores 32 petabytes of climate observations and simulations on the Discover supercomputing cluster.
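As a rough check on the LHC figures above, the unit conversion can be worked through in a few lines of Python. This is only a sketch: it assumes decimal units (1 exabyte = 1,000 petabytes) and a 365-day year, and the function and variable names are ours rather than anything taken from the cited sources.

```python
PB_PER_EB = 1_000      # assumption: decimal (SI) units
TB_PER_EB = 1_000_000  # assumption: decimal (SI) units
DAYS_PER_YEAR = 365    # assumption: non-leap year


def pb_per_year_to_eb_per_day(pb_per_year: float) -> float:
    """Convert an annual data rate in petabytes to a daily rate in exabytes."""
    return pb_per_year / PB_PER_EB / DAYS_PER_YEAR


# Unfiltered LHC sensor stream quoted in the text: 150 million PB per year.
unfiltered_eb_per_day = pb_per_year_to_eb_per_day(150e6)

# Recorded data quoted in the text: 25 PB per year before replication.
recorded_tb_per_day = pb_per_year_to_eb_per_day(25) * TB_PER_EB

print(f"unfiltered: {unfiltered_eb_per_day:.0f} EB/day")
print(f"recorded:   {recorded_tb_per_day:.1f} TB/day")
```

Under these assumptions the unfiltered stream comes to roughly 411 exabytes per day, the same order of magnitude as the "nearly 500 exabytes" quoted above, against roughly 68 terabytes per day actually recorded.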

    3.2 Government

    In 2012, the Obama administration announced the Big Data Research and Development Initiative to explore how big data could be used to address important problems faced by the government. The initiative is composed of 84 different big data programs spread across six departments. Big data analysis was, in part, responsible for the success of the BJP and its allies in the 2014 Indian general election.

    3.3 Private sector

    The Utah Data Center is a data center currently being constructed by the United States National Security Agency (NSA). When finished, the facility will be able to handle a large amount of information, measured in exabytes, collected by the NSA over the Internet. eBay.com uses two data warehouses at 7.5 petabytes and 40 PB as well as a 40 PB Hadoop cluster for search, consumer recommendations, and merchandising. Amazon.com handles millions of back-end operations every day, as well as queries from more than half a million third-party sellers. The core technology that keeps Amazon running is Linux-based, and as of 2005 the company had the world's three largest Linux databases, with capacities of 7.8 TB, 18.5 TB, and 24.7 TB. Wal-Mart handles more than 1 million customer transactions every hour, which are imported into databases estimated to contain more than 2.5 petabytes of data. Facebook handles 50 billion photos from its user base.

    3.4 Big Data Terrorism

    The recent Sony hacking case is notable because it appears to be the first state-sponsored act of cyber-terrorism in which a company has been successfully threatened under the glare of the national media. I'll leave it to the pundits to argue whether Sony's decision to postpone releasing an inane farce was prudent or cowardly. What's interesting is that the cyber terrorists caused real fear at Sony by publicly releasing internal enterprise data, including salaries, email conversations and information about actual movies. Security software companies are investing in big data analytics to help companies better protect against future attacks. The FICO Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts worldwide.

    3.5 Manufacturing

    The generated big data acts as the input to predictive tools and preventive strategies such as Prognostics and Health Management (PHM). Current PHM implementations mostly use data gathered during actual usage, while analytical algorithms can perform more accurately when more information from throughout the machine's lifecycle, such as system configuration, physical knowledge and working principles, is included. With this motivation, a coupled model scheme has been developed. The coupled model is a digital twin of the real machine that operates in the cloud platform and simulates the health condition with integrated knowledge from both data-driven analytical algorithms and other available physical knowledge.

    3.6 Climate

    Take Climate Corporation, for instance. Open access to weather data powers the company's insurance products and Internet software, which help farmers manage risk and optimize their fields. Or take Zillow as another example. The successful real estate media site uses federal and local government data, including satellite photography, tax assessment data and economic statistics, to provide potential buyers with a more dynamic and informed view of the housing market.

    3.7 Personalized Medicine

    Even as we engage in a vibrant discussion about the need for personal privacy, big data pushes the boundaries of what is possible in health care. Whether we label it precision medicine or personalized medicine, two aligned trends (the digitization of the health care system and the introduction of wearable devices) are quietly revolutionizing health and wellness. In the not-too-distant future, doctors will be able to create customized drugs and treatments tailored for your genome, your activity level, and your actual health. Big data analytics has the potential to disrupt the way we practice health care and change the way we think about our wellness.

    3.8 Digital Learning, Everywhere

    Both sides recognize that digital learning, inside and outside the classroom, is an unavoidable trend. From Massive Open Online Courses (MOOCs) to adaptive learning technologies that personalize the delivery of instructional material to the individual student, educational technology thrives on data. From names that you grew up with (McGraw Hill, Houghton Mifflin, Pearson) to some you didn't (Cengage, Amplify), companies are making bold investments in digital products that do more than just push content online; they're touting products that fundamentally change how and when students learn and how instructors evaluate individual student progress and aid their development. Now that we've moved past mere adoption to implementation and utilization, 2015 will undoubtedly be big data's break-out year.

    4. Market Growth

    Big data has increased the demand for information management specialists to the extent that Software AG, Oracle Corporation, IBM, FICO, Microsoft, SAP, EMC, HP and Dell have spent more than $15 billion on software firms specializing in data management and analytics. In 2010, this industry was worth more than $100 billion and was growing at almost 10 percent a year, about twice as fast as the software business as a whole.

    The world's effective capacity to exchange information through telecommunication networks was 281 petabytes in 1986, 471 petabytes in 1993, 2.2 exabytes in 2000 and 65 exabytes in 2007, and the amount of traffic flowing over the internet was predicted to reach 667 exabytes annually by 2014. It is estimated that one third of the globally stored information is in the form of alphanumeric text and still image data, which is the format most useful for most big data applications. This also shows the potential of as yet unused data (i.e. in the form of video and audio content).

    Data sets grow in size in part because they are increasingly being gathered by ubiquitous information-sensing mobile devices, aerial sensory technologies (remote sensing), software logs, cameras, microphones, radio-frequency identification (RFID) readers, and wireless sensor networks. The world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s; as of 2012, 2.5 exabytes (2.5 × 10^18 bytes) of data were created every day; as of 2014, 2.3 zettabytes (2.3 × 10^21 bytes) of data were created every day.
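The doubling claim above can be turned into an annual growth rate: if capacity doubles every T months, it grows by a factor of 2^(12/T) per year. A minimal sketch of that arithmetic follows; the function names are our own and are used only for illustration.

```python
def annual_growth_from_doubling(months_to_double: float) -> float:
    """Annual growth rate implied by a doubling time given in months."""
    return 2 ** (12 / months_to_double) - 1


def growth_factor(months_elapsed: float, months_to_double: float) -> float:
    """Total growth factor after a number of months at that doubling time."""
    return 2 ** (months_elapsed / months_to_double)


# Doubling time quoted in the text: 40 months.
rate = annual_growth_from_doubling(40)
decade = growth_factor(120, 40)

print(f"annual growth: {rate:.1%}")
print(f"growth over 10 years: {decade:.0f}x")
```

A 40-month doubling time therefore corresponds to about 23 percent growth per year, or an eightfold increase per decade.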

    5. Present Scenario

    Large amounts of data, a variety of sources, high-speed production, but also high-speed processing: these are the basic characteristics of Big Data. The amount of data that is generated and collected each second grows exponentially. The management of Big Data, the intelligent use of large, heterogeneous data sets, is becoming increasingly important for competition. It is affecting all sectors: industry and academia, but also the public sector. While the economy is exploring Big Data as a new gold mine, politicians are fighting over the problem of data capitalism, and science is tackling the question of cross-disciplinary benefits as well as the challenges and the likely consequences for technology, innovation, and society.

    As a marketing term or industry description, big data is so omnipresent these days that it doesn't mean much. But it is pretty clear that we are at a tipping point. The global scale of the Internet, the ubiquity of mobile devices, the ever-declining costs of cloud computing and storage, and an increasingly networked physical world create an explosion of data unlike anything we've seen before.

    Big data is nothing new. In fact, although the official definition, which refers to big data in terms of data volume, velocity and variety, only came about in 2001, companies have been gathering large amounts of data for decades. But big data has taken on a new lease of life in the last five years, largely as a result of companies finding new ways to analyze data. Experts at GP Bullhound, an investment banking firm, outline the future of big data in a new report entitled 'Extracting Insights from Exabytes'. Big data has moved on from the initial stage, where the challenge was storing the data, to the next, which is all about the insights companies can obtain from the data.

    5.1 On March 25-27, 2015, researchers and international experts meet in Hannover for a Herrenhausen Conference on "Big Data in a Transdisciplinary Perspective". The focus of the Herrenhausen Conference lies on open questions, unsolved problems, and future perspectives. The conference on Big Data therefore will not focus on a particular discipline but will provide a transdisciplinary forum for Big Data researchers. We would like to discuss the challenges and consequences of Big Data research for society as well as for innovation and technology, address the influence on economics as well as the legal framework, and close on the challenges for research and research funding in the field of Big Data. Our goal is to create an inspiring setting for the discussion of new ideas.

    5.2 Big Data Talent Hotspots

    San Francisco Bay Area: Despite high talent cost, the San Francisco Bay Area is expected to be the premier destination for Big Data and Analytics talent. The high-tech ecosystem provides a favorable setting for the development and advancement of new skillsets such as Big Data. The Bay Area is home to the Big Data teams of reputed global firms like Google, Amazon, Yahoo, Apple, LinkedIn and Facebook. The presence of premier research institutions such as Stanford and the University of California ensures a steady supply of qualified graduate engineers. They offer intensive programs in the field of Big Data research. The Bay Area is also a cradle for new-age Big Data startups such as Platfora and Adchemy.

    Bangalore: By 2020, Bangalore is expected to emerge as the second largest destination for Big Data R&D, driven by its fast-growing and cost-effective talent pool. MNCs such as Amazon, IBM, EMC and eBay have big data teams operating from Bangalore. Local companies such as TCS, Wipro and Infosys are also building Big Data capabilities to cater to their international clientele. The Indian Institute of Science, situated in Bangalore, is a premier institute involved in cutting-edge research in the field of statistics and analytics. In addition, the presence of startups is enriching the big data ecosystem. InMobi, a mobile advertisement platform headquartered in Bangalore, is building digital solutions for global customers using Big Data. Mu-Sigma Analytics, recently valued at over a billion dollars, has built significant analytical capabilities and employs a large number of decision scientists and data scientists.

    Shanghai: Shanghai is still in its nascent stages as a Big Data R&D hot spot, but is expected to grow rapidly, driven by talent and cost benefits. MNCs such as eBay, IBM, HP and Intel have small Big Data teams working on Big Data platforms.

    5.3 How big data can make cities work for the poor

    By Axel van Trotsenberg, Feb 2, 2015

    Big data can be a critically important tool in this exercise, which is the focus of our new report titled East Asia's Changing Urban Landscape: Measuring a Decade of Spatial Growth. It uses satellite imagery and geospatial mapping to provide an analytical overview of the region's urbanization in the first decade of the 21st century. This report uses comparable data on an international scale to build a foundation to help planners ensure that the rapid expansion of cities benefits people. This is critical for poverty reduction, because we already know that urbanization is associated with increased incomes. The new data is part of the World Bank's ongoing series of initiatives to engage with governments across the region on urbanization.

    5.4 A Golden Era of Insight: Big Data's Bright Future

    REDMOND, Wash., Feb. 15, 2013. At Microsoft Research labs around the world, some very deep thinkers are contemplating big data. This includes Eric Horvitz, distinguished scientist at Microsoft and co-director of Microsoft Research's Redmond lab, who was recently elected to the National Academy of Engineering for his work in computational mechanisms for decision making under uncertainty and with bounded resources. He sees a future where machines, fueled by large amounts of data, can become empowering, lifelong digital companions that know what you want or need and where you want to go, and that generally work with a passion on your behalf. Capturing data, storing it, interpreting it, and leveraging it can provide insights on small and large scales, and in high-tech and mainstream fields.

    Microsoft News Center recently spoke to Horvitz about how Microsoft Research (MSR) is investing time and talent in the area of big data and machine intelligence, what breakthroughs MSR has made, and his vision for the future of these fields. "Looking out at the longer-term future, I expect that machine learning, and machine intelligence more broadly, is going to provide us with foundational new tools for doing scientific research, and that many breakthroughs over the next few decades will come as a collaboration between people and the machine learning and reasoning tools." There are opportunities to learn new things from large amounts of data, including getting to the bottom of healthcare mysteries by going through data with automated learning. Another direction is working to weave together a set of technologies (machine learning, speech recognition, natural language understanding, machine vision and decision making) to create systems that act like bright collaborators and that complement human intellect in new kinds of ways.

    5.5 Data scientists: Next big opportunity for India

    Pradeep Thakur, TNN, 2012

    After the success of India's software and BPO industry, the next big thing is likely to be Big Data, where US multinationals are looking at the Indian market, with business worth $150 million expected to be created in the next few years. In what could be good news for Indians, these MNCs plan to hire around 1 lakh (100,000) professionals in a new category called Data Scientists by 2014. Even Indian IT majors are building up analytics practices to compete with global MNCs. A NASSCOM-CRISIL report puts the Big Data opportunity for the Indian IT industry at $1 billion globally by 2015.

    Academic courses have been designed in association with universities and were launched in Mumbai last week by a New York Stock Exchange listed firm, EMC Corporation, which has set a target of training at least 30,000 scientists for "Big Data" management in 2013. Harvard Business Review has termed Data Scientist the "sexiest career of the 21st Century." India is emerging as the most lucrative market for global IT giants, with independent studies projecting the Big Data solutions market to double in the next two years, from $80 million in 2012 to $153 million in 2014.

    An EMC report said, "Globally, we generated and consumed 1.8 zettabytes of data in 2011 which is expected to grow to 35 zettabytes by 2020. In India over the next decade by 2020, digital information will grow from 40,000 petabytes to 2.3 million petabytes, twice as fast as the worldwide rate." A zettabyte is a trillion gigabytes, or a billion terabytes. EMC, which has been providing data storage facilities for the UPA government's ambitious Unique Identification number project, has been in talks with the government to analyze the billions of data pieces it will capture, to provide solutions for efficient management of resources, and to study citizens' behavioral patterns to address their needs.
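The units in the EMC quotation can be checked directly. The sketch below assumes decimal (SI) prefixes and compares the overall growth factors implied by the quoted figures; the helper function and variable names are illustrative only.

```python
# Bytes per unit, assuming decimal (SI) prefixes.
BYTES = {"GB": 10**9, "TB": 10**12, "PB": 10**15, "ZB": 10**21}


def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a data size between the units listed in BYTES."""
    return value * BYTES[from_unit] / BYTES[to_unit]


# "A zettabyte is a trillion gigabytes, or a billion terabytes."
gb_per_zb = convert(1, "ZB", "GB")
tb_per_zb = convert(1, "ZB", "TB")

# Growth factors implied by the figures quoted from the EMC report.
world_factor = 35 / 1.8        # 1.8 ZB (2011) to 35 ZB (2020)
india_factor = 2.3e6 / 40_000  # 40,000 PB to 2.3 million PB

print(f"1 ZB = {gb_per_zb:.0e} GB = {tb_per_zb:.0e} TB")
print(f"world: {world_factor:.1f}x, India: {india_factor:.1f}x")
```

The conversion confirms the zettabyte statement. The growth factors are only indicative, since the two ranges quoted do not necessarily cover identical periods.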

    5.6 Apple and IBM Deliver First Wave of IBM MobileFirst for iOS Apps

    Big Data Analytics and Security Capabilities Arrive on iPhone & iPad

    CUPERTINO, California and ARMONK, New York, December 10, 2014. Apple and IBM today deliver the first wave of IBM MobileFirst for iOS solutions in a new class of made-for-business apps and supporting cloud services that bring IBM's big data and analytics capabilities to iPhone and iPad users in the enterprise. IBM MobileFirst for iOS solutions are now available to enterprise customers in banking, retail, insurance, financial services and telecommunications, and for governments and airlines, thanks to an unprecedented collaboration between Apple and IBM. IBM clients today announcing support for IBM MobileFirst for iOS solutions include Citi, Air Canada, Sprint and Banorte. Apple and IBM have launched a big data and analytics platform for iOS devices, designed to help businesses integrate secure, analytics-based apps and link them to current processes. It can be managed and upgraded via cloud services from IBM specifically for iOS devices, making the process simple and secure for everyone involved.

    6. Unique Features

    6.1 Unstructured data has never been so ubiquitous

    One of the elements that makes big data, well, big, is the data type. Unlike traditional business insight, which analyses structured data (such as financial details, sales and inventory), big data analytics tends to focus on unstructured data, such as emails, videos, photos and even posts on social media networks. According to a 2011 IBM report, IBM Big Data Success Stories, 90% of the world's data was created in the two years before publication. Here are some figures to help you understand just where such data is coming from: every minute, 208,300 photos are uploaded to Facebook and 350,000 updates are sent on Twitter.

    6.2 Tools such as 'Hadoop' mean that storing large amounts of data has become incredibly cheap

    As unstructured data has become more pervasive, Hadoop, an open-source framework for storing and processing large-scale data, has developed substantially in the last decade. No longer a research project, Hadoop underpins data processing at some of the world's largest internet businesses. Why? It can deal with unstructured data, and it is faster and cheaper than the tools that came before it.

    6.3 Big data analytics could lead to $610 billion in productivity gains in only four sectors

    If there were mainstream adoption of big data analytics, the retail and manufacturing industries alone could see an increase of $325 billion in their annual GDP as a result of increased efficiency, according to a report by McKinsey in July. Healthcare and government services could also see productivity gains of as much as $285 billion by 2020.

    6.4 Big data analytics saw nearly $1.4 billion of VC funding in the last 12 months

    Venture capitalists have started to look at big data analytics with increasing scrutiny. In the last 12 months, they invested $1.37 billion in various companies, an increase of 217% over investment in the previous period. There were 19 deals in the last quarter alone, according to GP Bullhound's report.
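To make the processing model behind Hadoop in Section 6.2 more concrete, the following is a minimal single-machine sketch of the map, shuffle and reduce phases of MapReduce, the programming model that Hadoop implements. It is plain Python and does not use any Hadoop API; the function names and the sample records are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(record: str) -> Iterator[tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one unstructured text record."""
    for word in record.lower().split():
        yield word.strip(".,!?"), 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group emitted values by key, as a framework does between map and reduce."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(records: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle and reduce in sequence over all records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample records standing in for posts or emails.
posts = ["Big data is big.", "Data is everywhere!"]
counts = word_count(posts)
print(counts)
```

In a real cluster the map and reduce calls run in parallel on many machines and the shuffle moves data across the network; here everything runs in one process purely to show the data flow.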

    7. Conclusion

    Big data is now "enterprise-ready", i.e. it is commercially useful. It is now cheaper to store and process this data, and increases in computer processing speeds mean that more businesses can leverage big data analytics. Analytics tools are opening up big data to people without specialized skills. Although only PhD-level specialists could understand the earliest versions of tools like Hadoop, new iterations and companies are democratizing big data. The most common feature is for companies to show the data in easy-to-understand visualisations. Analysis can now be done in real time. While Hadoop was not designed for real-time analysis, new companies are now innovating to build on the framework to give companies instant insight. Such technology is being used by companies like Hailo, the taxi-calling app, to assign drivers to prospective passengers.

    References

    [1] https://en.wikipedia.org/wiki/Big_data
    [2] https://agenda.weforum.org/2015/02/how-big-data-can-make-cities-work-for-the-poor/
    [3] https://timesofindia.indiatimes.com/tech/tech-news/Data-scientists-Next-big-opportunity-for-India/articlesshow/17282008.cms
    [4] https://www.microsoft.com/enterprise/en-esa/it-trends/big-data/articles/a-golden-era-of-insight-big-data-s-bright-future.aspx#fbid=tdyL9dPhDgz
    [5] https://www.volkswagenstiftung.de
    [6] https://www.apple.com/pr/library/2014/12/10Apple-and-IBM-Deliver-First-Wave-of-IBM-MobileFirst-for-iOS-Apps.html
    [7] https://www.nature.com/nature/journal/v455/n7209/full/455028a.html
    [8] https://www.sciencedirect.com/science/journal/22145796/1