introduction to data science - statinfer7. limited experience in predictive modeling techniques like...
TRANSCRIPT
![Page 1: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/1.jpg)
Introduction to Data ScienceVenkat Reddy
1
Statinfer.com
![Page 2: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/2.jpg)
Contents
•Introduction to Data Science
•Data Science Skills
•Data Science Techniques
•Data Science Tools
•Training Needs for a data scientist
2
Statinfer.com
![Page 3: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/3.jpg)
Statinfer.com
3
![Page 4: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/4.jpg)
Data Science is everywhere
Do you receive any marketing calls? Have you ever received any marketing call for
Audi car? –Marketing analytics
Have you ever wondered, why only you are getting promotional offers on cloths and
accessories where as I am getting offers on apartments? - Retail sales analytics
Why not the full talk time and message offers not same across all states & all
networks ? Telecommunications
How does a bank decide the potential fraud transactions from millions of credit
card swipes? - Fraud analytics
4
Statinfer.com
![Page 5: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/5.jpg)
What is data science?
•Data Driven decision making
•Building business strategies based on data analysis
•Removing the manual and judgemental error
•Making use of past experience(data) to make build future strategies
5
Statinfer.com
![Page 6: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/6.jpg)
Applications of Data Science
Data is growing at a rapid pace that lead to many application of data
driven strategies
•Recommendation systems
•Image recognition
•Speech recognition
•Fraud transaction identification
•Spam filtering
•Identify Sales leads
•Cross selling and upselling
6
Statinfer.com
![Page 7: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/7.jpg)
What is Data Science
•Data Science is a Fusion of many fields
•Database management
•Data Analytics
•Predictive modelling
•Machine Learning
•Big data Distributed computing
•Coding
•Data visualizations
•Reporting
7
Statinfer.com
![Page 8: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/8.jpg)
Data Science – Four Major Type of Skills
8
Database
Analytics & ML
BigdataPresentation
Statinfer.com
![Page 9: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/9.jpg)
The Techniques you need to know
Database Knowledge
•Data base Management
•Data blending
•Querying
•Data manipulations
•ETL
Predictive Analytics & ML
•Basic descriptive statistics
•Advanced analytics
•Predictive modeling
•Machine Learning
Big Data knowledge
•Distributed Computing
•Big Data analytics
•Unstructured data analysis
Presentation Skill
•Data visualizations
•Report design
• Insights presentation
9
Statinfer.com
![Page 10: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/10.jpg)
Data Science tools and software's
Database tools
SQL/MySql
OLAP cubes
Teradata
DB2/Sql Server/ Oracle/
Informix/Exadata
Analytical tools
SAS/R/SPSS/Python
Weka/MATLAB
Big Data Tools
Hadoop, Hive, Pig, Mahout, Spark, Java
Presentation Tools
Excel
Tableau, Qlikview 10
Statinfer.com
![Page 11: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/11.jpg)
Data Science -Designations
Database Developer
ETL Developer
MIS & DB Developer
Data Architect
Data Engineer
Data Analyst
Statisticians
Business Analyst
Data Scientist
Bigdata Developer
Hadoop Developer
Software Engineer
MIS Analyst
Reporting Analyst
Business Analyst 11
Statinfer.com
![Page 12: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/12.jpg)
Data Science Learning Path
12
Tools & Coding
R/SAS/Python/
Hadoop
Basic Statistics and Mathematics
Basic Algorithms -Regression, Classification and Segmentation
Advanced Algorithms -Neural Networks, SVMs and Boosting
Efficient Models & Distributed computing -Fine tuning the models and code
Statinfer.com
![Page 13: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/13.jpg)
What training should I take
13
Statinfer.com
![Page 14: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/14.jpg)
FAQ by Data Science Aspirants
•I want to be data scientist what training should I take?
•I already have knowledge on few tools, what are my next steps?
•What skill should I add to my profile to make it to next level?
•I am new to data science, where can I start ?
Statinfer.com
14
![Page 15: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/15.jpg)
Structured Learning Path to be a
successful Data Scientist
Statinfer.com
15
Step-1
Excel & SQL
Access
SAS/R/Python
Basic Statistics
Hive/Pig
Step-2
Basic Statistics & Data Cleaning
Data Analysis, Predictive Modeling
Tableau/Qlikview
Hadoop and Hive
Spark and Spark SQL
Step-3
Black box methods of ML
Unstructured Data & NOSQL
Mahout/ Spark ML
Python/Scala
Other Bigdata Tools
![Page 16: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/16.jpg)
Three Major Categories of Profiles
• You need training based on your skill level
• Based on skill set we can divide the whole data science aspirants into
four categories
1. Beginner - Completely new to Data Analytics
2. Intermediate - MIS and Reporting Analyst
3. Advanced - Data Analyst and Predictive Modeler
4. Complete Data Scientist – ML, Hadoop, R, Python, Spark
Statinfer.com
16
![Page 17: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/17.jpg)
If you are a beginner
Your Characteristics1. No hands-on experience on Data Analytics
projects
2. No hands-on experience on Data Analytics
tools like SAS & R
3. No hands-on experience on databases and SQL
4. No hands-on experience on excel
5. No idea on automation
6. No idea on data exploration and validation
7. Never worked on reporting projects
8. Very limited relevant business experience
9. Want to get started with Analytics / Data
Science
Your Training needs• Get trained on one topic at least from each of these
categories
• 1
• Excel
• Sql
• 2
• SAS programming
• R programming
• SPSS programming
• Python
• 3
• MS Access
• Excel VBA
• Hive
• Pig
• 4
• Basics of Statistics & Data Cleaning
Statinfer.com
17
![Page 18: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/18.jpg)
Suitable job profiles for beginner
1. Reporting Analyst
2. Business Analyst
3. Data Analyst
4. Analyst Associate
5. Junior data scientist
6. Hadoop Developer
7. Tableau Developer
8. Database Developer
9. ETL Developer
10.MIS Analyst 18
Statinfer.com
![Page 19: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/19.jpg)
If you are a MIS and Reporting Analyst
Your Characteristics1. Knows excel
2. Knows SQL
3. Knows Basics of SAS / R Coding
4. Relevant business experience & good domain
knowledge
5. Experience in data exploration, validation and
cleaning
6. Worked on lot of MIS and reporting projects, created
lot of dash-boards
7. Limited experience in predictive modeling techniques
like logistic regression, decision trees, time series
8. Worked on few automation projects
9. Want to get into core analytics, predictive modeling
and model building
Your Training needs
• Get trained on one topic at least from each
of these categories
• 1• Basic Descriptive Statistics & Data Cleaning
• 2• Predictive modeling techniques
• 3• Visualizations using Tableau
• Visualizations using Qlikview
• 4
• Hadoop Introduction & Hive
• Spark Introduction & Spark Sql
Statinfer.com
19
![Page 20: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/20.jpg)
Suitable job profiles for MIS and Reporting
Analyst 1. Data Analyst
2. Business Analyst
3. Junior data scientist
4. Data Scientist
5. Hadoop Developer
6. Predictive Modeler
20
Statinfer.com
![Page 21: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/21.jpg)
Course Contents
1. R programming and Basic Stats
2. Predictive modelling in R
3. Machine Learning in R
4. Machine Learning Projects
5. Bigdata & Hadoop
6. Hive and Pig
7. Python Programming
8. Machine Learning on Python
9. Machine Learning Projects in Python
10. Bigdata Projects
11. Spark and Scala(Optional)
12. Tableau (Optional) 21
Statinfer.com
![Page 22: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/22.jpg)
If you are a Data Analyst
Your Characteristics1. Knows SAS/R/Python
2. Already worked on few analytics projects
3. Already worked on predictive modeling techniques
like logistic regression, decision trees, time series
4. Have very good business knowledge
5. Never worked on bigdata
6. No hands on experience on Hadoop, Hive or Spark
7. Very limited experience in machine learning
techniques like neural networks, svm, bagging and
boosting
8. Want to get a clear idea on end to end predictive
modeling
9. Want to get into data science by learning big data and
machine learning
Your Training needs
• Get trained on one topic at least from each of
these three categories
• 1
• Black box methods of ML
• Ensemble models
• 2
• Mahout
• Spark ML
• 3
• Python
• Scala
Statinfer.com
22
![Page 23: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/23.jpg)
Suitable job profiles for Data Analyst
1. Data Scientist
23
Statinfer.com
![Page 24: Introduction to Data Science - Statinfer7. Limited experience in predictive modeling techniques like logistic regression, decision trees, time series 8. Worked on few automation projects](https://reader034.vdocuments.us/reader034/viewer/2022042312/5eda71ecb3745412b5715a50/html5/thumbnails/24.jpg)
Thank you
24
Statinfer.com