understanding big data, a business perspective
DESCRIPTION
Understanding Big Data, a Business Perspective: explains Big Data in terms of use cases & infrastructure, without diving much into the technology.
TRANSCRIPT
Understanding
Big Data [ A Business Perspective ]
by
Shiva Dasharathi
1: Big Data ?
2: Benefits ?
3: How is it different from BI, HANA etc. ?
4: Infrastructure? Hadoop? NOSQL?
5: Challenges in migrating to big data?
1: Big Data ?
What is big data?
Terminology
Big Data evolution
Anatomy of use cases
Big Data “Any Data that is worth analysing”
Variety of data
Velocity & Volume of data
Real-time analysis, for:
- Feedback and complaints about your products & services in social media
- Capturing customer behaviour
- Predicting things before they happen
Big Data > Characteristics of big data:
1. Volume of data
2. Velocity of data
3. Variety of data
* Difficult to handle in traditional ways
Complex
Big Data > Terminology
Structured data:
NAME         NATURE                              TAG
Shiva        Thinking -- Forgetful               Philosophy
Shravanthi   Innocent -- Sensitive -- Journalist Champion
Subhash      Artistic -- Descretive              Champion
Shreyas      Logical -- Passionate               Champion
Adithya      Logical -- Articulative             Champion
Pallavi      Outspeaking -- Friendly             Champion
Sonal        Dancer -- Sportive                  Champion
Nikhil       Poetic -- Sportive                  Champion
Gayathri     Prude -- Honest                     Champion
Anitha       Blesser -- Gentle                   Champion
Amandeep     Aesthetic -- Independent            Champion
Malathi      Unknown                             Champion
Ankitha      Cricketer -- Gentle                 Champion
Vikram       Logical -- Articulative             Champion
Ankesh       Honest -- Passionate                Champion
Tejo         Managerial -- Patience              Leading
CJ           Quick -- Logical                    Champion
Deba         Dedication -- Honest                Champion
Charu        Managerial -- Independent           Leading
Ashok        Social -- Helping                   Leading
Sijesh       Analytical -- Eager -- Helping      Balanced Leading
Tarun        Shrewd -- Responsive                Leading
Pavan        Optimization -- Shrewd              Administrative
Bhargav      Balancing -- Friendly               Champion
Surya        Enthusiasm -- Learning              Versatile
Swarnav      Social -- Outspeaking               Administrative
Bidisha      Outspeaking -- Social               Administrative
Niranjan     Articulative -- Friendly            Leading
Big Data evolution
Commodity/Cheap Hardware
Open source software
Valuable Data
Data mining / Data analysis using statistical modelling techniques
Big Data > Anatomy of use cases
Enterprise Analytics: Health care, Banking, Energy & Utilities, Telecom, Supply chain, Retail, Real estate, Agriculture, Sustainability, etc.
Social media Apps & Analytics: Social networking web sites, Search engines, Job portals, News portals, Travel, Recommendation Apps, Online movie stores, Animation industry, etc.
Research Oriented Fields: Space research, Bio research, Image processing, etc.
2: Benefits ?
Advanced Predictions
use of predictions
2: Benefits ? Advanced Predictions
Source 1, Source 2, Source 3 -> Data mining techniques -> Knowledge Models -> Predictions:
- Revenue / spend forecasts
- Sentiment analysis
- Customer behaviour analysis
- etc.
2: Benefits ? Use of predictions
- Helps to understand & optimize complex business processes
- Predict opportunities / risks
- Understand strengths / weaknesses
- Optimize resource usage
* At much cheaper costs.
3: How is it different from traditional DBs?
Big Data Vs RDBMS
Big Data Vs BW, HANA
Big Data Vs RDBMS
Big Data tools ->
- More for data analysis
- Lack transaction system capabilities
- Do not fully comply with ACID properties
- Do not allow constraints to be applied at the data level
* Do not compete with OLTP systems
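To make the transaction and constraint points concrete, here is a minimal sketch of what an OLTP-style RDBMS provides, using Python's built-in sqlite3 module. The accounts table, the transfer function and the amounts are made up for illustration and are not from the session.

```python
import sqlite3

# In-memory database; the "accounts" table is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # constraint at data level
)
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money between accounts atomically: both updates apply, or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False

# Account 1 only holds 100, so this transfer must fail as a whole.
ok = transfer(conn, 1, 2, 500)
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(ok, balances)
```

The failed transfer leaves both balances untouched, even though the credit to account 2 ran first. This all-or-nothing behaviour, plus the enforced constraint, is what most Big Data tools give up in exchange for scale.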
Big Data Vs BW (or) HANA
BW / HANA ->
- Limited by scalability (vertical scaling)
- Cannot deal with unstructured data
- Not fault tolerant
- Not well integrated with open source analytical / data mining tools
4: Infrastructure? Hadoop? NOSQL?
What is NOSQL
What is Hadoop
What is Hadoop
A distributed, parallel data processing system (often grouped with NOSQL technologies)
Input files -> map, map, map -> reduce -> Output files
Storage: HDFS
Programming API: MapReduce
Well established ecosystem: Hive, Pig, HBase, Sqoop, Oozie, Flume, Mahout, ZooKeeper, etc.
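The map and reduce flow can be sketched in a few lines of plain Python. This is a single-machine illustration of the programming model only, not Hadoop's actual API. The function names and the sample input are made up for the example. On a real cluster each input file would be handled by a separate mapper running in parallel.

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(mapped_pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reducer: combine all values for one key into a single result."""
    return key, sum(values)

def run_job(input_files):
    """Run map over every line of every input file, then reduce per key."""
    mapped_pairs = []
    for lines in input_files:  # on a cluster, each file goes to its own mapper
        for line in lines:
            mapped_pairs.extend(map_phase(line))
    grouped = shuffle(mapped_pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Three hypothetical input files, each a list of lines.
input_files = [
    ["big data is data"],
    ["data worth analysing"],
    ["big value"],
]
word_counts = run_job(input_files)
print(word_counts)
```

Because each mapper looks only at its own input and each reducer only at its own key, the work can be spread over many cheap machines. That is the main reason the model suits commodity hardware.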
What is NOSQL
NOSQL -> Not Only SQL
- Highly available
- Distributed
- Fault tolerant
- Schema-less
NOSQL DBs:
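Key-value based: Redis, Riak (Hadoop MapReduce also works on key-value pairs, though Hadoop itself is a processing framework, not a database)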
Columnar based: HBase, Cassandra
Graph based: Neo4j, OrientDB
Document based: MongoDB, CouchDB
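The four families differ mainly in how a record is shaped and looked up. The sketch below mimics each data model with plain Python structures. It does not use any real database client, and the customer records, keys and field names are hypothetical.

```python
import json

# Key-value: the store sees only a key and an opaque value.
key_value_store = {"customer:42": json.dumps({"name": "Asha", "city": "Pune"})}

# Columnar (column family): a row key maps to families of columns,
# and different rows may carry different columns.
column_family_store = {
    "customer:42": {
        "profile": {"name": "Asha", "city": "Pune"},
        "activity": {"last_login": "2013-05-01"},
    }
}

# Document: self-describing nested records; no fixed schema across documents.
document_store = [
    {"_id": 42, "name": "Asha", "city": "Pune",
     "orders": [{"item": "phone", "qty": 1}]},
    {"_id": 43, "name": "Ravi"},
]

# Graph: nodes plus typed relationships between them.
nodes = {42: {"name": "Asha"}, 43: {"name": "Ravi"}}
edges = [(42, "FOLLOWS", 43)]

def followed_by(person_id):
    """Names of the people that the given person follows."""
    return [
        nodes[dst]["name"]
        for src, relation, dst in edges
        if src == person_id and relation == "FOLLOWS"
    ]

name_from_kv = json.loads(key_value_store["customer:42"])["name"]
city_from_columns = column_family_store["customer:42"]["profile"]["city"]
customers_with_orders = [doc["name"] for doc in document_store if doc.get("orders")]
asha_follows = followed_by(42)
print(name_from_kv, city_from_columns, customers_with_orders, asha_follows)
```

Which model fits depends on the questions you ask of the data: simple lookups by key, wide sparse rows, nested records, or relationships between entities.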
5: Challenges? What to consider?
What data does your organization have? Size & sources?
What analytical use cases are to be implemented?
What infrastructure do you have?
Which hardware to buy? What configuration?
What NOSQL DBs do you need? Which vendor to approach?
Migration plan?
1. What is Big Data?
2. How will I benefit?
3. I don’t have huge data, should I still consider big data?
4. I already have BI or HANA etc. set up, can I leverage them?
5. What challenges may I face?
6. How much can I save?
(Questions covered in the session)
Thank You @shivadasharathi