Data Science Training Bangalore, ITPL Whitefield Aug / Sep 2015

Upload: arun-thakur

Post on 14-Apr-2017

Category: Data & Analytics


On Saturday, 22nd August 2015, from 10:00 AM to 12:00 PM

At InfoVision Solution India Pvt Ltd, 7th Floor, Discoverer, ITPL, Whitefield, Bangalore - 560066

Course Details

Duration: 100 Hours, One Month

Fees: INR 10,000/-

Location: InfoVision Solution India Pvt Ltd, 7th Floor, Discoverer, ITPL, Whitefield, Bangalore - 560066

Schedule: 2-4 Hours per day, weekdays and weekend flexi hours

Outcome: Able to independently deliver data science projects

Certification: InfoVision Certified Data Scientist Level 1

Today’s competitive battleground is fueled by information.

Big Data solutions are designed to capture, process, store, and analyze data so that the right person gets the right information, at the right time.

What is Big Data?

Big Data refers to datasets whose volume, velocity, variety, and complexity exceed the ability of commonly used software tools to capture, process, store, manage, and analyze them. Big Data combines different types of data:

•Unstructured: data communicated every day by email, phone, text, tweet, and video

•Semistructured: data generated by machines, for example, log files

•Structured: data traditionally stored in databases, such as account information and credit card transactions

The challenge of Big Data is to efficiently and effectively capture, store, manage, and analyze 100 percent of the data to drive business insight and timely decisions.

What is Apache Hadoop?

Apache Hadoop is an open-source software framework written in Java for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware.

What is R Programming?

R is a programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians and data miners for developing statistical software and data analysis.

What is Pentaho?

The Pentaho BI Project is an ongoing effort by the Open Source community to provide organizations with best-in-class solutions for their enterprise Business Intelligence (BI) needs.

Course Contents

Statistics and Business Analytics: 40 Hours

•Introduction to R Programming – RStudio, Shiny

•Introduction and Data Analytics

•Statistics – Mean, Mode, Median, Standard Deviation

•Introduction to WEKA

•Classification, Association Rules, Regression Analysis, Cluster Analysis

•Algorithms - K-means, TwoStep, Kohonen net, Apriori and GRI

•Decision Tree and Clustering

•Projects: Patient Analytics, Automobile Analytics, Football Analytics, Stock Market Analytics
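As a rough illustration of the descriptive statistics topic in this module (mean, mode, median, standard deviation), here is a minimal sketch using Python's standard `statistics` module. The course itself lists R for this work; Python is used here only for illustration, and the `visits` sample is made-up data, not taken from any course project.

```python
import statistics

# Hypothetical sample: daily patient visits at a clinic (made-up numbers).
visits = [12, 15, 12, 18, 20, 15, 12]

mean = statistics.mean(visits)       # arithmetic average
median = statistics.median(visits)   # middle value of the sorted sample
mode = statistics.mode(visits)       # most frequent value
std_dev = statistics.stdev(visits)   # sample standard deviation (divides by n - 1)

print(f"mean={mean:.2f} median={median} mode={mode} std_dev={std_dev:.2f}")
```

`statistics.stdev` treats the data as a sample; `statistics.pstdev` would be the choice if the list were the whole population.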

Data Visualization: 20 Hours

•Introduction

•Pentaho – Installation

•Pentaho Report Designer (PRD)

•PostgreSQL connection to Pentaho Tools

•Pentaho Data Integration

•Saiku Analytics in Pentaho

Big Data & Hadoop: 20 Hours

•Cloudera 5.3 Installation

•Introduction to Hadoop Distributed File System (HDFS)

•Understanding - Map-Reduce Basics

•Sqoop / ZooKeeper

•HBase

•Pig

•Hive

•Flume
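The Map-Reduce basics listed in this module can be sketched with the classic word-count example. The code below is a single-process illustration of the map, shuffle, and reduce steps in plain Python; it does not use Hadoop itself, and all function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical names chosen for this sketch.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all counts emitted for one word into a single total."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data big insight", "data drives insight"])
print(counts)
```

On a real cluster the map and reduce functions run in parallel on different machines and the input is read from HDFS; the sketch only shows how the three steps fit together.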

Case Studies & Projects: 20 Hours

•Case Studies: Supply Chain Optimization, Genome Analysis

•Project Work