business analytics - big data

BIG DATA: EMERGING JUGGERNAUT IN IT INDUSTRY
JENNIFER BOOMERSHINE, RAJAT RAJAT, SCOTT SPRADLING, IDEAN TAGHEIZADEH, ROBERT HURLEY

Post on 22-Jan-2018

BIG DATA

BIG DATA DESCRIBES THE APPLICATION OF NEW TOOLS AND TECHNIQUES TO DIGITAL INFORMATION ON A SIZE AND SCALE WELL BEYOND WHAT WAS POSSIBLE WITH TRADITIONAL APPROACHES, TYPICALLY INVOLVING DATA SETS THAT ARE SO LARGE AND COMPLEX THAT THEY REQUIRE ADVANCED DATA STORAGE, MANAGEMENT, ANALYTICS, AND VISUALIZATION TECHNOLOGIES.

Abstracts of Papers, 250th ACS National Meeting & Exposition, Boston, MA, United States, August 16-20, 2015 (2015), ANYL-28.

DATA IS GENERATED BY:

• Airlines: 10 TB every 30 minutes; 640 TB of data per flight.

• Internet users: By the end of 2015, Internet traffic will exceed 4.8 ZB per year.

• Emails: 300 billion emails are sent every day.

• Facebook: 25 TB of data daily.

• Twitter: 12 TB of data daily. About 97,000 tweets are sent every second.

• YouTube: 2.9 billion hours of video are watched per month.

• Trading: NYSE produces 1 TB of data per trading day.

• Experiments: atomic particle experiments produce 40 TB per second.
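
The figures above are quoted over different time windows, which makes them hard to compare at a glance. The following is a minimal Python sketch that converts the volume figures to a common unit of TB per day. The numbers are copied from the slide and are not verified here; the dictionary name `RATES_TB`, the helper `tb_per_day`, and the choice to treat a trading day as a 24-hour day are assumptions made only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# (amount in TB, length of the reporting window in seconds), as quoted on the slide.
# Assumption: a "trading day" is treated as a full 24-hour day.
RATES_TB = {
    "Airlines (10 TB per 30 min)": (10, 30 * 60),
    "Facebook": (25, SECONDS_PER_DAY),
    "Twitter": (12, SECONDS_PER_DAY),
    "NYSE": (1, SECONDS_PER_DAY),
    "Particle experiments": (40, 1),
}


def tb_per_day(amount_tb, window_seconds):
    """Scale a volume quoted over an arbitrary window to a 24-hour rate."""
    return amount_tb * SECONDS_PER_DAY / window_seconds


daily = {name: tb_per_day(*rate) for name, rate in RATES_TB.items()}

# Print the sources from largest to smallest daily volume.
for name, value in sorted(daily.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:30s} {value:>12,.0f} TB/day")
```

On this common scale the sources span several orders of magnitude, which is the point the slide makes informally.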

FACTS

THREE V’S

10% STRUCTURED, 90% UNSTRUCTURED

Big Data

• Velocity: real time, batches, data streams

• Variety: structured, unstructured, weakly correlated

• Volume: tables, petabytes, transactions

HOW MUCH?

IDC's Digital Universe Study, sponsored by EMC Corporation, December 2012.

BioTechnology: An Indian Journal (2014), 10(15, Pt. 4), 8811-8816.

50-fold increase in data volume

0.7 zettabytes in 2009

35 zettabytes in 2020
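
As a quick check on the figures above, the short Python sketch below recomputes the fold increase from the two endpoints and derives the compound annual growth rate they imply. The growth-rate calculation is not on the slide; it assumes smooth year-over-year compounding between 2009 and 2020, and all variable names are illustrative.

```python
# Endpoints as quoted on the slide, in zettabytes.
start_zb, start_year = 0.7, 2009
end_zb, end_year = 35.0, 2020

fold_increase = end_zb / start_zb
years = end_year - start_year

# Assumption: growth compounds evenly each year between the two endpoints.
cagr = fold_increase ** (1 / years) - 1

print(f"Fold increase: {fold_increase:.0f}x over {years} years")
print(f"Implied compound annual growth: {cagr:.1%}")
```

Under that assumption, a 50-fold increase over 11 years corresponds to data volume growing by a little over 40% per year.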

Unstructured Data

Structured Data

“Smart Data”

IN ORDER TO HANDLE BIG DATA, WE NEED SMART DATA

PROCESS

Abstracts of Papers, 250th ACS National Meeting & Exposition, Boston, MA, United States, August 16-20, 2015 (2015), ANYL-28.

CRITICAL SOLUTION

OPEN-SOURCE FRAMEWORK WITH MASSIVELY PARALLEL PROCESSING

• APACHE HADOOP

• PYTHON

• MAPREDUCE

• GOOGLE FILE SYSTEM

• SAAS

Master node manages file location & failures

Specific tasks are assigned to individual nodes

Scale: 1 MB to 1 GB, same protocol

Scaling is linear: increase processing capacity by increasing the number of computers
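
The division of work described above can be illustrated with a small word count written in the map and reduce style. This is a simplified sketch on a single machine, not Hadoop's API: worker threads stand in for individual nodes, `run_job` plays the role of the master that hands out tasks, and there is no shuffle step or failure handling. The function names and the sample input are assumptions chosen for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_task(chunk):
    """Map step run by one worker: count the words in its own chunk of lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Reduce step: merge the per-worker counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(lines, num_workers):
    """'Master' role: split the input, give each worker one chunk, merge results."""
    chunks = [lines[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(map_task, chunks))
    return reduce_counts(partials)


lines = ["big data needs smart data", "smart data needs tools", "big tools"]
result = run_job(lines, num_workers=2)
print(result.most_common(3))
```

The same `run_job` call gives identical counts for any number of workers, which mirrors the "same protocol" point: only the number of workers changes as the input grows.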

COMPETITIVE EDGE

COMPANIES WILL GAIN A COMPETITIVE EDGE BY:

• COLLECTING, ANALYZING, & UNDERSTANDING INFORMATION

• REDUCING THE TIME TO MARKET FOR PRODUCTS

• INCREASING THE SUCCESS RATE OF PRODUCT TRANSACTIONS

GOVERNMENTAL USE - PROACTIVE ACTIONS

• PROJECT VOTE SMART®: POLITICAL SCIENCE, SOCIAL, & ECONOMIC DATA

• PVS® IS SOFTWARE FOR ANALYZING & CREATING USEFUL INFORMATION

• GOOGLE FLU TRENDS (GFT): BUILDING EFFECTIVE STRATEGY

• INFLUENZA AFFECTS 5-20% OF THE U.S. POPULATION EVERY YEAR, RESULTING IN OVER 200,000 HOSPITALIZATIONS.

DUMBILL, E. (2013). MAKING SENSE OF BIG DATA. BIG DATA, 1(1), 1-2.

POTENTIAL OF SMART DATA

ACKNOWLEDGEMENT

UOFL SCIFINDER

UOFL PUBMED

DR. MANJU AHUJA

PROFESSIONAL MBA COHORT

COLLEGE OF BUSINESS

REFERENCES

ABSTRACTS OF PAPERS, 250TH ACS NATIONAL MEETING & EXPOSITION, BOSTON, MA, UNITED STATES, AUGUST 16-20, 2015 (2015), ANYL-28.

BIOTECHNOLOGY: AN INDIAN JOURNAL (2014), 10(15, PT. 4), 8811-8816.

BOYD, D., & CRAWFORD, K. (2012). CRITICAL QUESTIONS FOR BIG DATA: PROVOCATIONS FOR A CULTURAL, TECHNOLOGICAL, AND SCHOLARLY PHENOMENON. INFORMATION, COMMUNICATION & SOCIETY, 15(5), 662-679.

CHEN, H., CHIANG, R. H., & STOREY, V. C. (2012). BUSINESS INTELLIGENCE AND ANALYTICS: FROM BIG DATA TO BIG IMPACT. MIS QUARTERLY, 36(4), 1165-1188.

DUMBILL, E. (2013). MAKING SENSE OF BIG DATA. BIG DATA, 1(1), 1-2.

IDC'S DIGITAL UNIVERSE STUDY, SPONSORED BY EMC CORPORATION, DECEMBER 2012.