a simple introduction to big data and hadoop

Upload: hamid-shekarforoush

Post on 24-Jan-2017

TRANSCRIPT

Page 1: a simple introduction to big data and hadoop

BIG DATA

Page 2: a simple introduction to big data and hadoop

BIG DATA EXAMPLE

• Social media (likes, friends, videos, pictures, tweets, …)
• Mobile signals, sensors, clicks
• Online shopping, stocks
• Code
• …

Page 3: a simple introduction to big data and hadoop

BUY A BOOK FROM AMAZON

• Knows what you searched for
• Knows everything you have ever bought
• Knows how much you are willing to pay
• Asks Facebook (friends, likes, hangouts, …)
• Knows who else is buying what

Page 4: a simple introduction to big data and hadoop

BIG DATA USAGE?

Page 5: a simple introduction to big data and hadoop

WHAT IS BIG DATA?

• Any data that you cannot store on a single PC
• The 3 Vs (Volume, Velocity, Variety)

Page 6: a simple introduction to big data and hadoop

APACHE HADOOP

• Apache Hadoop is an open-source software framework written in Java for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware.

Page 7: a simple introduction to big data and hadoop

DISTRIBUTED STORAGE: HDFS (HADOOP DISTRIBUTED FILE SYSTEM)

SUPERCOMPUTER? NORMAL COMPUTERS?

Page 8: a simple introduction to big data and hadoop

WHY HDFS?

• What if something goes wrong (hardware failure)?
• What is the cost of a supercomputer?
• How easily can we add capacity?

• Automatically handles hardware failure
• Automatically backs up data
• Just buy new, cheap computers
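
The "automatically backs up data" point can be illustrated with a toy simulation. This is not HDFS code: the function names `place_blocks` and `lost_blocks`, the node names, and the round-robin placement are all assumptions made for illustration (real HDFS uses a rack-aware placement policy). The replication factor of 3 matches the HDFS default. The sketch only shows why keeping several copies of each block on different cheap machines lets the data survive when some of them fail.

```python
def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes (simple round-robin).

    Hypothetical placement rule for illustration; not the HDFS policy.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for block in range(num_blocks):
        placement[block] = {
            nodes[(block + i) % len(nodes)] for i in range(replication)
        }
    return placement


def lost_blocks(placement, failed_nodes):
    """Return the blocks whose every replica sat on a failed node."""
    failed = set(failed_nodes)
    return [block for block, holders in placement.items() if holders <= failed]


# Five cheap machines holding ten blocks, three copies of each (made-up setup).
nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
placement = place_blocks(num_blocks=10, nodes=nodes, replication=3)

# One machine dies: every block still has surviving copies.
print(lost_blocks(placement, ["node-b"]))

# Three machines die at once: only blocks stored on exactly those three are lost.
print(lost_blocks(placement, ["node-a", "node-b", "node-c"]))
```

Adding capacity in this picture means appending another name to `nodes`, which mirrors the "just buy new, cheap computers" point.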

Page 9: a simple introduction to big data and hadoop

DISTRIBUTED PROCESSING (MAPREDUCE)

• How do you count the number of trees in the United States?
• Solution 1: ask Superman?
• Solution 2: ask 1,000 people?
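
The "ask 1,000 people" idea can be sketched in a few lines of plain Python. This is a minimal single-machine illustration, not Hadoop's API: the names `map_count` and `merge_counts` and the tiny `regions` data set are made up. Each region stands for the area one person surveys (the map step), and the partial tallies are then merged into one total (the reduce step). On a real cluster the map calls would run in parallel on different machines.

```python
from collections import Counter
from functools import reduce


def map_count(region):
    """Map step: one worker tallies the trees in its own region, by species."""
    return Counter(region)


def merge_counts(left, right):
    """Reduce step: combine two partial tallies into one."""
    return left + right


# Made-up survey data: each inner list is the region assigned to one worker.
regions = [
    ["oak", "pine", "oak"],
    ["pine", "maple"],
    ["oak", "maple", "maple", "pine"],
]

partial = [map_count(region) for region in regions]   # done independently
total = reduce(merge_counts, partial, Counter())      # merged at the end

print(sum(total.values()))  # total number of trees across all regions
print(dict(total))          # per-species totals
```

No worker needs to see another worker's region, which is why the job can be split across as many people (or machines) as are available.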

Page 10: a simple introduction to big data and hadoop

BIG DATA USAGE IN COMPUTER SCIENCE

• Mining repositories
• Ownership (plagiarism, copyright)
• Detecting code smells
• Auto-commenting
• Predicting bugs, bug reports

Page 11: a simple introduction to big data and hadoop

OTHER TOPICS

• Data scientist
• NoSQL
• Machine learning

Page 12: a simple introduction to big data and hadoop