big data ecosystem
TRANSCRIPT
![Page 1: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/1.jpg)
BIG data ecosystem
Mariusz Gil
![Page 2: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/2.jpg)
/ ABOUT ME /
![Page 3: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/3.jpg)
This talk is about BIG DATA
![Page 4: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/4.jpg)
What is... BIG DATA?
![Page 5: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/5.jpg)
![Page 6: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/6.jpg)
VOLUME: large amounts of data
![Page 7: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/7.jpg)
VELOCITY: needs to be analyzed quickly
![Page 8: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/8.jpg)
VARIETY: different types of structured and unstructured data
![Page 9: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/9.jpg)
Big Data is data that is too large, complex and dynamic for any conventional data tools to capture, store, manage and analyze.
![Page 10: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/10.jpg)
30 billion pieces of content were added in the past month
![Page 11: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/11.jpg)
more than 2 billion videos were watched yesterday
![Page 12: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/12.jpg)
more than 58 million messages were sent yesterday
![Page 13: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/13.jpg)
/ MAIN QUESTIONS /
![Page 14: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/14.jpg)
WHY?
![Page 15: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/15.jpg)
IMPROVED RISK MANAGEMENT: 49%
IMPROVED MANAGEMENT CONTROL: 36%
IT ANALYSIS: 40%
MARKET-ORIENTED PRODUCT DEVELOPMENT: 43%
INCREASED SALES FIGURES: 32%
FINANCES AND ECONOMICS: 27%
![Page 16: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/16.jpg)
![Page 17: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/17.jpg)
690-node Hadoop cluster for predictions and analytics
![Page 18: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/18.jpg)
HOW?
![Page 19: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/19.jpg)
![Page 20: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/20.jpg)
HDFS: HADOOP DISTRIBUTED FILE SYSTEM
YARN / MapReduce v2: DISTRIBUTED PROCESSING FRAMEWORK
HBASE: COLUMNAR STORAGE
HIVE: SQL DATA WAREHOUSE ENGINE
AVRO: DATA SERIALIZATION
MAHOUT: SCALABLE MACHINE LEARNING
PIG: SCRIPTING FOR LARGE DATA SETS
OOZIE: WORKFLOWS ORCHESTRATION
ZOOKEEPER: DISTRIBUTED COORDINATION SERVICE
FLUME: LOG COLLECTOR
SQOOP: DATA EXCHANGE
AMBARI: PROVISIONING, MANAGING AND MONITORING CLUSTERS
WHIRR: RUNNING CLOUD SERVICES
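To make the "distributed processing framework" layer a little more concrete, here is a minimal single-machine sketch of the MapReduce idea (map, shuffle, reduce) as a word count. It is not Hadoop code and does not use any Hadoop API; the function names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count` are hypothetical, chosen only for illustration. On a real cluster the framework would run the map and reduce steps in parallel across nodes and perform the shuffle over the network.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by key.

    A real framework does this between map and reduce; here it is
    simulated in memory (an assumption made for the sketch).
    """
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce sequentially over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data ecosystem", "big data tools"]
    print(word_count(sample))
    # prints {'big': 2, 'data': 2, 'ecosystem': 1, 'tools': 1}
```

The map and reduce functions hold no shared state, which is what lets a framework split the input across many nodes and still combine the results correctly.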
![Page 21: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/21.jpg)
![Page 22: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/22.jpg)
We can choose from multiple VENDORS
like Cloudera, HortonWorks or Amazon
![Page 23: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/23.jpg)
Even from...
![Page 24: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/24.jpg)
Can we get results FASTER?
![Page 25: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/25.jpg)
Apache Drill
Storm
Cloudera Impala
![Page 26: Big data ecosystem](https://reader030.vdocuments.us/reader030/viewer/2022032514/55d515b2bb61eb8a6b8b4657/html5/thumbnails/26.jpg)
thanks