Entering the Data Analytics industry
![Page 1: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/1.jpg)
1
UPGRAD WORKSHOP
10TH DEC’16 @HYD
ENTERING THE
DATA ANALYTICS INDUSTRY
B GANES KESARI
VP, GRAMENER
![Page 2: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/2.jpg)
2
DATA ANALYTICS?
WHAT’S THE BUZZ AROUND ANALYTICS
![Page 3: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/3.jpg)
“We have internal information. Getting information from outside is our challenge. There’s no way of doing that.”
– Senior Editor, Leading Media Company
![Page 4: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/4.jpg)
INDIA’S RELIGIONS
4
![Page 5: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/5.jpg)
AUSTRALIA’S RELIGIONS
5
![Page 6: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/6.jpg)
6
![Page 7: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/7.jpg)
WHAT ARE PEOPLE LOOKING FOR IN DATA ANALYTICS?
7
| USA | India |
| --- | --- |
| data analytics jobs | data analytics courses |
| data analytics tools | data analytics tools |
| data analytics salary | data analytics jobs |
| data analytics training | data analytics companies |

Legend: Jobs & Salary, Tools, Companies, Training & Courses
Source: https://google.com, https://google.co.in
![Page 8: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/8.jpg)
WHAT’S THE POPULARITY OVER TIME?
8
“Data Analytics”
Source: https://trends.google.com/
![Page 9: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/9.jpg)
WHICH CITIES HAVE INTEREST IN DATA ANALYTICS?
9
Source: https://trends.google.com/
Cities shown in the chart: Gurgaon, Pimpri-Chinchwad, Noida, Bengaluru, Hyderabad, Chennai, Singapore, Mumbai, San Francisco, Dublin, Boston, Washington, Pune, Howrah, Toronto, New York, Sydney, New Delhi, Chicago, Melbourne
![Page 10: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/10.jpg)
10
WHAT’S THE STATE OF THE
DATA ANALYTICS JOBS
![Page 11: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/11.jpg)
WHO’S RECRUITING THE TEAMS?
11
IBM India
Accenture
JPMorgan
KPMG
Concentrix Daksh
Microsoft India
Ernst & Young
UnitedHealth Group
Shell India Markets
Amazon Dev Centre
GE India Technology
Hewlett-Packard
Deloitte
Cisco Systems
WNS
Xerox
eClerx Services
Mphasis
AIG Analytics
Sapient Consulting
#Jobs
Source: https://www.naukri.com
![Page 12: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/12.jpg)
WHAT INDUSTRIES USE DATA ANALYTICS?
12
Software
Banking, Financial Services
Internet, Ecommerce
KPO, Research, Analytics
BPO, Call Centre, ITES
Recruitment, Staffing
Strategy Mgmt Consulting
Media & Entertainment
Advertising & PR
Accounting & Finance
Telcom, ISP
Education, Teaching & Training
Pharma, Biotech & Clinical Research
Insurance
FMCG, Foods & Beverage
Source: https://www.naukri.com
![Page 13: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/13.jpg)
WHAT DO THEY PAY?
13
0-3 Lakhs
3-6 Lakhs
6-10 Lakhs
10-15 Lakhs
15-25 Lakhs
25-50 Lakhs
50-75 Lakhs
75-100 Lakhs
100+ Lakhs
Source: https://www.naukri.com
![Page 14: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/14.jpg)
WHERE ARE THE DATA ANALYTICS JOBS?
14
Source: https://www.naukri.com
Bengaluru
Delhi NCR
Mumbai
Gurgaon
Hyderabad
Others
Pune
Noida
Chennai
Delhi
![Page 15: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/15.jpg)
WHO ARE THE BIG PLAYERS IN THIS SPACE?
15
Source: Gartner BI Magic Quadrant
![Page 16: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/16.jpg)
WHICH STARTUPS OFFER DATA ANALYTICS IN INDIA?
16
Source: https://angel.co/... and more
![Page 17: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/17.jpg)
17
WHY DATA ANALYTICS?
WHAT’S CAUSING ALL THIS BUZZ
![Page 18: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/18.jpg)
CLASSES OF ANALYTICAL SOLUTIONS
18
| Mode | Approach to data | Benefits |
| --- | --- | --- |
| Proactive Action | What should I do to achieve my goal? | Data products, data validated actions, increased success rate of strategic initiatives |
| Proactive Decisions | What is likely to happen? | Support for strategic initiatives, forward looking decision making |
| Proactive Consumption | Why did it happen? | Significant improvements from status quo, data backed management |
| Active | What happened? | Marginal business benefits, process gap identification |
![Page 19: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/19.jpg)
19
CLASSES OF ANALYTICAL SOLUTIONS

| Mode | Approach to data | Benefits |
| --- | --- | --- |
| Proactive Action | | |
| Proactive Decisions | | |
| Proactive Consumption | | |
| Active | Operational Reporting for measurement of efficiency & compliance | Marginal business benefits, process gap identification |
![Page 20: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/20.jpg)
TIMES NOW COVERAGE HAD 80%+ VIEWERSHIP
20
![Page 21: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/21.jpg)
21
CLASSES OF ANALYTICAL SOLUTIONS

| Mode | Approach to data | Benefits |
| --- | --- | --- |
| Proactive Action | | |
| Proactive Decisions | | |
| Proactive Consumption | Root Cause Analysis, Benchmarking and multi-dimensional analysis | Significant improvements from status quo, data backed management |
| Active | | |
![Page 22: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/22.jpg)
DETECTING FRAUD
“We know meter readings are incorrect, for various reasons.
We don’t, however, have the concrete proof we need to start the process of meter reading automation.
Part of our problem is the volume of data that needs to be analysed. The other is the inexperience in tools or analyses to identify such patterns.”
– ENERGY UTILITY
![Page 23: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/23.jpg)
This plot shows the frequency of all meter readings from Apr-2010 to Mar-2011. An unusually large number of readings are aligned with the tariff slab boundaries. This clearly shows collusion of some form with the customers.

This happens with specific customers, not randomly. Here are such customers’ meter readings.

| Apr-10 | May-10 | Jun-10 | Jul-10 | Aug-10 | Sep-10 | Oct-10 | Nov-10 | Dec-10 | Jan-11 | Feb-11 | Mar-11 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 217 | 219 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 350 | 200 | 200 |
| 250 | 200 | 200 | 200 | 201 | 200 | 200 | 200 | 250 | 200 | 200 | 150 |
| 250 | 150 | 150 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 150 |
| 150 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 50 |
| 200 | 200 | 200 | 150 | 180 | 150 | 50 | 100 | 50 | 70 | 100 | 100 |
| 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 110 | 100 |
| 100 | 150 | 123 | 123 | 50 | 100 | 50 | 100 | 100 | 100 | 100 | 100 |
| 0 | 111 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 50 | 50 |
| 0 | 100 | 27 | 100 | 50 | 100 | 100 | 100 | 100 | 100 | 70 | 100 |
| 1 | 1 | 1 | 100 | 99 | 50 | 100 | 100 | 100 | 100 | 100 | 100 |

If we define the “extent of fraud” as the percentage excess of the 100-unit meter reading, the value varies considerably across sections and time, with some explainable anomalies.

| Section | Apr-10 | May-10 | Jun-10 | Jul-10 | Aug-10 | Sep-10 | Oct-10 | Nov-10 | Dec-10 | Jan-11 | Feb-11 | Mar-11 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Section 1 | 70% | 97% | 136% | 65% | 110% | 116% | 121% | 107% | 114% | 88% | 74% | 109% |
| Section 2 | 66% | 92% | 66% | 87% | 70% | 64% | 63% | 50% | 58% | 38% | 41% | 54% |
| Section 3 | 90% | 46% | 47% | 43% | 28% | 31% | 50% | 32% | 19% | 38% | 8% | 34% |
| Section 4 | 44% | 24% | 36% | 39% | 21% | 18% | 24% | 49% | 56% | 44% | 31% | 14% |
| Section 5 | 4% | 63% | -27% | 20% | 41% | 82% | 26% | 34% | 43% | 2% | 37% | 15% |
| Section 6 | 18% | 23% | 30% | 21% | 28% | 33% | 39% | 41% | 39% | 18% | 0% | 33% |
| Section 7 | 36% | 51% | 33% | 33% | 27% | 35% | 10% | 39% | 12% | 5% | 15% | 14% |
| Section 8 | 22% | 21% | 28% | 12% | 24% | 27% | 10% | 31% | 13% | 11% | 22% | 17% |
| Section 9 | 19% | 35% | 14% | 9% | 16% | 32% | 37% | 12% | 9% | 5% | -3% | 11% |

New section manager arrives … and is transferred out

Why would these happen?

Simple histograms have been applied to manage ALM compliance, fraud in corporate directorships, and collusion in schools.
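The histogram check described on this slide can be sketched in a few lines of Python. This is only an illustration: the slab boundaries (50, 100, 150, 200 units) are an assumption, since the slides do not list the tariff slabs, and `excess_at` is one hypothetical way to read the "extent of fraud" definition (the count at 100 units compared with the average count of nearby values).

```python
from collections import Counter

# Assumed tariff slab boundaries in units; the slides do not list them.
SLAB_BOUNDARIES = (50, 100, 150, 200)


def boundary_share(readings, boundaries=SLAB_BOUNDARIES):
    """Return the fraction of readings that sit exactly on a slab boundary."""
    if not readings:
        return 0.0
    on_boundary = sum(1 for r in readings if r in boundaries)
    return on_boundary / len(readings)


def excess_at(readings, value=100, window=5):
    """Percentage excess of the count at `value` over the average count of
    its neighbours within +/- `window` units (a hypothetical reading of the
    slide's "extent of fraud")."""
    counts = Counter(readings)
    neighbours = [
        counts[v]
        for v in range(value - window, value + window + 1)
        if v != value
    ]
    baseline = sum(neighbours) / len(neighbours)
    if baseline == 0:
        # No neighbouring readings at all: the excess is unbounded.
        return float("inf") if counts[value] else 0.0
    return 100.0 * (counts[value] - baseline) / baseline


# One customer's twelve monthly readings (first row of the slide's table).
customer = [217, 219, 200, 200, 200, 200, 200, 200, 200, 350, 200, 200]
print(f"{boundary_share(customer):.0%} of readings sit on a slab boundary")
```

A customer whose readings land on a boundary far more often than chance would suggest is a candidate for closer inspection, not proof of fraud on its own.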
![Page 24: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/24.jpg)
What do children in schools know, and what can they do, at different stages of elementary education?
Have the inputs made into the elementary education system had a beneficial effect or not?
24
![Page 25: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/25.jpg)
HAVING BOOKS IMPROVES READING ABILITY
Having more books at home improves the performance of children when it comes to reading. (But children typically have only 1-10 books at home.)
Number of students sampled
What is the impact? How many more marks can having more books fetch?
Circle size indicates number of students with this response. Few students have no books.
Is this response (“25+ books”) good or bad? Small red bars indicate low marks. Large green bars indicate high marks. Students having 25+ books tend to score high marks.
The most common response is marked in blue. It also has the largest circle.
The graphic is summarized in words
Indicates whether the best response is the most popular. Blue means that it is not. Green means that it is. Red means that the worst level is the most popular response.
25
![Page 26: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/26.jpg)
HAVING MORE SIBLINGS DOESN’T HELP READING
Children with 1 sibling do much better than children with many siblings
26
![Page 27: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/27.jpg)
… BUT HELPS A LOT IN MATHEMATICS
Children with 4+ siblings do very well, children with 1 sibling fare poorly
27
![Page 28: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/28.jpg)
TUITIONS HELP A LITTLE
… BUT NOT CHILDREN WITH 4+ SIBLINGS
28
![Page 29: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/29.jpg)
TUITIONS HELP A LITTLE
… BUT NOT CHILDREN OF ILLITERATE PARENTS
29
![Page 30: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/30.jpg)
CHILDREN LIKE GAMES, AND THEY’RE GOOD
… but playing daily hurts reading ability
30
![Page 31: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/31.jpg)
31
CLASSES OF ANALYTICAL SOLUTIONS

| Mode | Approach to data | Benefits |
| --- | --- | --- |
| Proactive Action | | |
| Proactive Decisions | Statistical Analysis thru Segmentation, Decision Trees and Cause-effect Modelling | Support for strategic initiatives, forward looking decision making |
| Proactive Consumption | | |
| Active | | |
![Page 32: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/32.jpg)
32
Telecommunication
“How to predict customer churn at least a month ahead”
![Page 33: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/33.jpg)
33
Background & Objective
Customer churn is a well-known problem in the telecom industry today. One of the leading telecom operators in the country wanted to predict churn 2/4 weeks ahead using an analytical model.

Gramener Approach

Exploratory Analysis & Influencers
Exploratory business analysis was performed to identify influencers and create additional derived metrics and derived dimensions.

Linear Discriminant Parameters
Using selected metrics, models were built on linear classification, such as Decision Trees and Linear Discriminant Parameters.

Non-Linear Models
Using selected metrics, non-linear families of models were built: Neural Networks, Random Forests & Support Vector Machines.

Predictive Intervention
• The best model was implemented & compared with a control set
• Targeted promotions for the predicted set yielded ~60% reduction in churn
CLASSES OF ANALYTICAL SOLUTIONS
![Page 34: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/34.jpg)
MODEL BUILDING & FINE-TUNING

Models Deployed
✓ Pair-wise correlation
✓ Multi-linear regression
✓ Linear Discriminant Analysis
✓ Decision Tree
✓ Support Vector Machines
✓ Neural Networks
✓ Random Forest

Other Variability
✓ Predict Duration
✓ Ageing of model

Input Metrics - Customer
✓ Incoming & Outgoing Minutes
✓ Incoming & Outgoing Calls
✓ Daily Mobile Usage
✓ Closing Balance
✓ Customer activation date

Input - Derived & Growth Metrics
✓ Last/Average Closing balance in a month
✓ Days since the last Outgoing Call
✓ Days since the last Recharge
✓ Total Decrement
✓ Monthly Refill Amount
✓ Total Minutes incl Incoming & Outgoing
✓ Percentage of Incoming Minutes
✓ Recharge Values
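A few of the derived metrics listed on this slide can be computed from raw customer data roughly as follows. This is a minimal sketch, not the pipeline used in the project: the record layout, field names and the sample customer are all hypothetical and chosen only for illustration.

```python
from datetime import date


def derived_metrics(record, today):
    """Compute a few derived inputs from one customer's raw usage record.

    `record` is a hypothetical dict; its field names are assumptions.
    """
    incoming = record["incoming_minutes"]
    outgoing = record["outgoing_minutes"]
    total = incoming + outgoing
    balances = record["closing_balances"]  # closing balances within the month
    return {
        "total_minutes": total,
        "pct_incoming_minutes": 100.0 * incoming / total if total else 0.0,
        "days_since_last_outgoing_call": (today - record["last_outgoing_call"]).days,
        "days_since_last_recharge": (today - record["last_recharge"]).days,
        "last_closing_balance": balances[-1],
        "average_closing_balance": sum(balances) / len(balances),
    }


# Hypothetical customer record, for illustration only.
example = {
    "incoming_minutes": 300,
    "outgoing_minutes": 100,
    "last_outgoing_call": date(2016, 11, 20),
    "last_recharge": date(2016, 11, 1),
    "closing_balances": [50.0, 40.0, 30.0],
}
metrics = derived_metrics(example, today=date(2016, 12, 10))
print(metrics)
```

Metrics such as days since the last recharge turn raw logs into signals a churn model can use directly.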
![Page 35: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/35.jpg)
Base MODEL

| MISSED | WASTED | COST PER CUST. | IMPROVEMENT |
| --- | --- | --- | --- |
| 8.3% | 0.0% | 6.61 | 0.0% |

| Actual \ Prediction | No churn | Churn |
| --- | --- | --- |
| No churn | OK | WASTED (Marketing cost Rs 40) |
| Churn | MISSED (Acquisition cost Rs 80) | OK |
![Page 36: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/36.jpg)
Random Forest/SVM/etc MODELS

| MISSED | WASTED | COST PER CUST. | IMPROVEMENT |
| --- | --- | --- | --- |
| ~1-2% | ~2-3% | ~2.0-3.0 | ~40-50% |
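The cost-per-customer figures on these two slides appear to combine the two error types with their costs. The sketch below assumes the formula is missed rate × acquisition cost + wasted rate × marketing cost. The slides do not state the formula, so treat it as an assumption. With the base model's 8.3% missed rate it gives about Rs 6.64, close to the 6.61 shown (the gap is presumably rounding of the missed rate).

```python
MARKETING_COST = 40    # Rs, spent on a customer wrongly predicted to churn (WASTED)
ACQUISITION_COST = 80  # Rs, spent replacing a churner the model missed (MISSED)


def cost_per_customer(missed_rate, wasted_rate):
    """Expected cost per customer under the assumed formula:
    missed churners cost an acquisition, false alarms cost a promotion."""
    return missed_rate * ACQUISITION_COST + wasted_rate * MARKETING_COST


# Base model: nobody is flagged, so every churner is missed and nothing is wasted.
base = cost_per_customer(missed_rate=0.083, wasted_rate=0.0)

# A model at the midpoints of the quoted ranges (~1-2% missed, ~2-3% wasted).
model = cost_per_customer(missed_rate=0.015, wasted_rate=0.025)

print(f"base: Rs {base:.2f} per customer, model: Rs {model:.2f} per customer")
```

Because a missed churner costs twice as much as a wasted promotion here, a model can afford some false alarms and still lower the overall cost.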
![Page 37: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/37.jpg)
37
CLASSES OF ANALYTICAL SOLUTIONS

| Mode | Approach to data | Benefits |
| --- | --- | --- |
| Proactive Action | Data driven decision making, thru advanced mathematical models and scenario planning | Data products, data validated actions, increased success rate of strategic initiatives |
| Proactive Decisions | | |
| Proactive Consumption | | |
| Active | | |
![Page 38: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/38.jpg)
HEURISTICS
EMERGENCY
“A man is rushed to a hospital in the throes of a heart attack.
The nurse needs to decide whether the victim should be admitted into emergency care.
Although this decision can save or cost a life, the nurse must decide using only the available cues, and within a few seconds – preferably using some fancy statistical software package.”
![Page 39: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/39.jpg)
HEURISTICS
EMERGENCY
Decision tree of three yes/no questions:
Pressure < 91? (No / Yes)
Age > 62? (No / Yes)
Pulse > 100? (No / Yes)
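A heuristic like this can be written as a handful of sequential checks. The sketch below uses the three cues from the slide, but the outcome attached to each branch is an assumption: the extracted text keeps only the questions, not which answer leads to emergency care. The function name and the units are hypothetical as well.

```python
def needs_emergency_care(pressure, age, pulse):
    """Three-question triage heuristic using the cues from the slide.

    The exits (which answer leads where) are assumed for illustration.
    """
    # Cue 1: very low blood pressure is treated as high risk straight away.
    if pressure < 91:
        return True
    # Cue 2: with adequate pressure, patients aged 62 or under are treated as low risk.
    if not age > 62:
        return False
    # Cue 3: for older patients, a fast pulse decides the case.
    return pulse > 100


# Hypothetical patients, for illustration only.
print(needs_emergency_care(pressure=85, age=40, pulse=70))
print(needs_emergency_care(pressure=120, age=70, pulse=110))
print(needs_emergency_care(pressure=120, age=70, pulse=80))
```

Each question can end the decision on its own, which is what lets the nurse act within seconds using only the available cues.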
![Page 40: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/40.jpg)
VISUAL ANALYTICS IS IMPERATIVE FOR
ANALYTICS → INSIGHTS → ACTION
Spot the unusual
Communicate patterns
Simplify decisions
![Page 41: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/41.jpg)
41
SKILLS & ROLES
THAT YOU SHOULD PICK UP
![Page 42: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/42.jpg)
SO, WHAT’S THE SKILL NEEDED TO CREATE THESE?
42
Deep Domain Expertise
Visual Design & Presentation
Deep Programming
Statistics & Machine Learning
Passion for Numbers
Domain Orientation
![Page 43: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/43.jpg)
…AND WHAT ARE THE ROLES AVAILABLE?
43
Deep Domain Expertise
Visual Design & Presentation
Deep Programming
Statistics & Machine Learning
Passion for Numbers
Domain Orientation
Data Scientist
![Page 44: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/44.jpg)
SO, WHAT’S THE SKILL NEEDED TO CREATE THESE?
44
Deep Domain Expertise
Visual Design & Presentation
Deep Programming
Statistics & Machine Learning
Passion for Numbers
Domain Orientation
Functional Consultant
![Page 45: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/45.jpg)
SO, WHAT’S THE SKILL NEEDED TO CREATE THESE?
45
Deep Domain Expertise
Visual Design & Presentation
Deep Programming
Statistics & Machine Learning
Passion for Numbers
Domain Orientation
Information Designer
![Page 46: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/46.jpg)
SO, WHAT’S THE SKILL NEEDED TO CREATE THESE?
46
Deep Domain Expertise
Visual Design & Presentation
Deep Programming
Statistics & Machine Learning
Passion for Numbers
Domain Orientation
Data Analyst
![Page 47: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/47.jpg)
SO, WHAT’S THE SKILL NEEDED TO CREATE THESE?
47
Deep Domain Expertise
Visual Design & Presentation
Deep Programming
Statistics & Machine Learning
Passion for Numbers
Domain Orientation
Data Scientist
Functional Consultant
Information Designer
Data Analyst
![Page 48: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/48.jpg)
48
TOOLS & SOFTWARE
THAT YOU SHOULD BE LOOKING AT
![Page 49: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/49.jpg)
THE DATA SCIENCE TOOLKIT
Alteryx, Amazon EC2, Azure ML, BigQuery, Birst, Caffe, Cassandra, Cloud Compute, Cloudera, Cognos, CouchDB, D3, Decision tree, ElasticSearch, Excel, Gephi,
ggplot2, Hadoop, HP Vertica, IBM Watson, Impala, Julia, Jupyter Notebook, Kafka, Kibana, Kinesis, Lambda, Leaflet, Logstash, MapR, MapReduce, Matplotlib, Microstrategy,
MongoDB, NodeXL, Pandas, Pentaho, Pivotal, PowerPoint, Power BI, Qlikview, R, R Studio, Random Forest, Redis, Redshift, Regression, Revolution R, S3, SAP Hana,
SAS, Spark, Spotfire, SPSS, SQL Server, Stanford NLP, Storm, SVM, Tableau, TensorFlow, Teradata, Theano, Thrift, Torch, Weka, Word2Vec
The tool does not matter. A person’s skill with the tool does.
Pick the person. Let them pick the tool.
![Page 50: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/50.jpg)
50
TRAINING & COURSES
THAT WILL HELP YOU ENTER THE INDUSTRY
![Page 51: Entering the Data Analytics industry](https://reader031.vdocuments.us/reader031/viewer/2022021922/586f870f1a28ab54768b570d/html5/thumbnails/51.jpg)
SELF-LEARNING
51
TAILORED COURSES
LEARN ON THE JOB