Upload: vongoc

Post on 23-Mar-2018

QUALITY THOUGHT Hadoop Course Content

QUALITY THOUGHT * www.facebook.com/qthought * www.qualitythought.in PH NO: 9963799240, 040-40025423 Email Id: [email protected]

START DATE : TIMINGS : DURATION : TYPE OF BATCH : FEE : FACULTY NAME : LAB TIMINGS :

We are ready to serve the latest testing trends. Are you ready to learn?

New Batches Info

Introduction to Hadoop/Big Data:

Hadoop is an open-source software framework used for distributed storage and processing of big data sets using the MapReduce programming model. It runs on computer clusters built from commodity hardware. All the modules in Hadoop are designed with the fundamental assumption that hardware failures are common occurrences and should be handled automatically by the framework.

The core of Apache Hadoop consists of a storage part, known as the Hadoop Distributed File System (HDFS), and a processing part based on the MapReduce programming model. Hadoop splits files into large blocks and distributes them across nodes in a cluster. It then transfers packaged code to the nodes to process the data in parallel. This approach takes advantage of data locality, where nodes manipulate the data they have access to. This allows the dataset to be processed faster and more efficiently than it would be in a more conventional supercomputer architecture that relies on a parallel file system where computation and data are distributed via high-speed networking.
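The split, map and reduce flow described above can be sketched in plain Python. This is a minimal, single-process illustration and not Hadoop's actual API: the function names (`map_phase`, `shuffle`, `reduce_phase`) and the two sample text blocks are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(block):
    """Mapper: emit a (word, 1) pair for every word in one block of text."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(mapped_pairs):
    """Shuffle: group the emitted values by key across all mapper outputs."""
    grouped = defaultdict(list)
    for pairs in mapped_pairs:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reducer: sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical "blocks" standing in for file blocks stored on different nodes.
blocks = ["big data needs big storage", "hadoop stores big data"]

# Each block is mapped independently, which is what lets a real cluster
# run the mappers in parallel on the nodes that hold the data.
mapped = [map_phase(block) for block in blocks]
word_counts = reduce_phase(shuffle(mapped))

print(word_counts)
```

In a real cluster the mapper for each block would run on a node that already stores that block, and only the much smaller intermediate pairs would travel over the network during the shuffle.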

The base Apache Hadoop framework is composed of the following modules:

Hadoop Common – contains libraries and utilities needed by other Hadoop modules;

Hadoop Distributed File System (HDFS) – a distributed file system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster;

Hadoop YARN – a platform responsible for managing computing resources in clusters and using them for scheduling users' applications; and

Hadoop MapReduce – an implementation of the MapReduce programming model for large-scale data processing.
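As a rough illustration of how HDFS stores a file, the sketch below splits a file into blocks and assigns each block to several nodes. The 128 MB block size and the replication factor of 3 are the Hadoop 2.x defaults; the round-robin placement, the `plan_blocks` helper and the node names are simplifying assumptions for illustration, since real HDFS placement is rack-aware.

```python
import math

BLOCK_SIZE_MB = 128  # default HDFS block size in Hadoop 2.x
REPLICATION = 3      # default HDFS replication factor


def plan_blocks(file_size_mb, nodes,
                block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Return {block_index: [node, ...]} using simple round-robin placement.

    This is an assumed, simplified policy; it only shows that every block
    ends up on several distinct nodes.
    """
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    copies = min(replication, len(nodes))
    placement = {}
    for block in range(num_blocks):
        placement[block] = [nodes[(block + i) % len(nodes)]
                            for i in range(copies)]
    return placement


nodes = ["node1", "node2", "node3", "node4"]  # hypothetical DataNode names
layout = plan_blocks(300, nodes)              # a 300 MB file

for block, replicas in layout.items():
    print(block, replicas)
```

Because every block lives on more than one node, losing a single node leaves at least two other copies of each of its blocks available.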

Offered Tools in Hadoop:

Covered Tools

1. Hadoop Architecture 2. HDFS 3. MapReduce 4. Pig 5. Hive

6. Sqoop 7. NoSQL 8. HBase 9. Oozie 10. YARN 11. ZooKeeper

1. Spark 2. Scala 3. Kafka 4. NoSQL - Cassandra 5. NiFi

6. IoT 7. Flink

Understanding Big Data and Hadoop

What is Big Data

3 V's Concepts

Different Problems and Solutions of Big Data

Hadoop Architecture

What is Big Data

What is Hadoop and History of Hadoop

Hadoop Architecture

Hadoop ecosystem components

Hadoop Storage: HDFS

Hadoop Processing: MapReduce Framework

Hadoop Server Roles: NameNode, Secondary NameNode, and DataNode

Anatomy of File Write and Read.

Different Components of Hadoop.

HDFS (Hadoop Distributed File System)

Significance of HDFS in Hadoop

Features of HDFS

5 Daemons of Hadoop

NameNode and its functionality

DataNode and its functionality

JobTracker and its functionality

TaskTracker and its functionality

Secondary NameNode and its functionality

Data Storage in HDFS

Introduction about Blocks

Data Replication

Data storage in Data Nodes

Replication Configuration

Custom Replication

Fail Over Mechanism

Design Constraints

Replication Factor

Changing block size for file and Directory

Hadoop Cluster Configuration and Data Loading

Hadoop Cluster Architecture

Hadoop Cluster Configuration files

Hadoop Cluster Modes

MapReduce Job execution

Common Hadoop Shell commands

Hadoop Copy Commands

Introduction about Blocks

Data Replication

Hadoop MapReduce framework

MapReduce Architecture

Hadoop Data Types

Hadoop MapReduce paradigm

Mapper and Reducer tasks

MapReduce Execution Framework

Partitioners and Combiners

Hands on MapReduce Programming.

Advanced MapReduce

MapReduce Programming Model

Different Phases of MapReduce Algorithm

How to write a basic MapReduce Program

The Driver Code, The Mapper, The Reducer

Joining Data Sets in MapReduce Jobs: Map Joins and Reduce Joins

Creating Input and Output Formats in MapReduce Jobs

Text Input Format, Key Value Input Format, Sequence File Input Format

How to Debug MapReduce Jobs in Eclipse

Data Localization in MapReduce

Combiner (Mini Reducer) and Partitioner

Speculative execution on Mappers and Reducers

Distributed Cache

Counters, Custom Writable

Secondary Sorting Using MapReduce

Apache PIG and Pig Latin

Introduction to Apache PIG

MapReduce Vs. Apache PIG

SQL Vs. Apache PIG

Physical & Logical Layer

Different Data types in Apache PIG

Modes of Execution in Apache PIG

Local Mode, MapReduce or Distributed Mode

Execution Mechanism

Grunt shell, Script, Embedded

Transformations in PIG

How to write a simple PIG Script

UDFs in PIG

Hands-on with Pig Latin scripts

Hive and HiveQL

HIVE Introduction

Hive Architecture and Installation

Comparison with Traditional Database

Operators and Functions

Hive Metastore and Integration with MySQL

Hive integration with Hadoop

SQL vs. HiveQL

Hive UDFs

Partitioning, Dynamic Partitioning and Bucketing

Hive SerDe (Serialization and Deserialization)

RegexSerDe (Regular Expressions)

Hive Tables (Managed Tables and External Tables, Storage Formats, Importing Data, Altering Tables, Dropping Tables)

Hive data formats – Text, ORC, Avro, Parquet

SQOOP

Introduction to SQOOP

How to connect to a relational database using SQOOP

Different Sqoop Commands

Different flavours of imports, exports, and Hive imports

Hands-on with Examples

HBase and ZooKeeper

HBase introduction

HBase use cases

HBase basics

Column families, Scans

HBase architecture

ZooKeeper Service:

Data Model, Operations, Implementation, Consistency, Sessions, States

HBase Admin

Schema definition, Basic CRUD Operations

Flume

Flume Introduction

Flume Architecture

Flume Master, Flume Collector and Flume Agent

Real-time example with Twitter

Oozie

Oozie Introduction

Oozie Architecture

Oozie Configuration files

Oozie Job Submission

o workflow.xml

o Coordinator.xml

o Job.coordinator.properties

Hadoop 2.0, MRv2 and YARN

Hadoop 2.0 New Features: NameNode High Availability, HDFS Federation, MRv2, YARN, Running MRv1 in YARN

Apache Spark with Scala

Introduction to Scala

Why Scala

Scala Vs Java

Scala Basics

Scala Data types

Scala Packages

Variable Declarations

Variable Type Inference

Control Structures

Interactive Scala – Scala Shell

Writing Scala Scripts – Compiling the Scala Programs

Defining Functions in Scala

Different IDEs for Scala

SPARK

Introduction to Spark

Motivation for Spark

Spark Vs MapReduce Processing

Architecture of Spark

Spark Shell Introduction

Creating Spark Context

File Operations in Spark Shell

Caching in Spark

Real-time Examples of Spark

Spark Components

o Spark Core

o Spark SQL

o Spark Streaming

o Spark MLlib

Features of RDD: Lazily Evaluated, Immutable, Partitioned

RDD operations: Actions, Transformations in RDD

KAFKA:

Introduction Apache Kafka

What is Kafka?

Need for Kafka

Core Concepts of Kafka

Kafka Architecture

Where is Kafka Used?

Deep Dive into Kafka Cluster

Understanding the components of Kafka Cluster, Installation of Kafka Cluster, Configuring Kafka Cluster, Producer of Kafka, Consumer of Kafka, Producer and Consumer in Action.

Kafka Operations and Performance Tuning

Offset

Design Hardware

Kafka Monitoring and Issues

Kafka Performance Tuning

Reading data from Kafka

Demo: Twitter Kafka Producer

Kafka with Spark

Ecosystem of Spark

Understanding the Spark Cluster

Integrating Kafka with Spark

NoSQL Cassandra:

Introduction of BigData and NoSQL

What is Big Data

What is SQL

What is NoSQL

Brewer's CAP Theorem

Introducing Cassandra

Distributed and Decentralized

Cassandra Architecture

o Ring distributed architecture

o Peer-to-Peer

o Gossip protocol

o Failure Detection

Elastic Scalability

High Availability and Fault Tolerance

Tunable Consistency

Column-Oriented

Schema-Free

High Performance

Installing Cassandra

Installation of DataStax Cassandra

Installation of DevCenter

CQL – Cassandra Query Language

Keyspaces

CQL Tables

Partition Keys / Primary Key

Clustering Keys

Composite Keys

Secondary Indexes

Materialized Views

Java with Cassandra

Installing Oracle JDK 1.7

DataStax Java driver API

Cassandra & Java Datatypes (Blob)

Sample Java application with Cassandra Driver

Hands-on with the Java connector for Cassandra

Java Web Application with Cassandra database

Table operations from Java application

o Create Keyspace

o Create Table, index, etc.

o Insert, Update, and Alter using Collection data types

Sample Java Web application with Cassandra on App server

Advanced topics of Cassandra

Eventual Consistency

CQL Batch

Security

Durability

Eventual Consistency & Tunable Consistency

Multi Data Centers

Alter & Drop Table

Commit Log, MemTable, SSTable

How writes work

How reads work

Scalability

Partitioning

Replication

Compaction: Size-tiered, Leveled

CQL Handling Blobs

IoT (Internet of Things):

1. IoT-Introduction

Introducing IoT

Elements of IoT

Real World IoT Applications

2. IoT-Architecture

Elements of IoT Architecture: Sensors, Actuators, Gateway, IoT Platforms and Analytics

4. Communication Protocols

Wide area communication protocols: Cellular, Sigfox, Satellite

Communication protocols: MQTT, CoAP, XMPP

By Sathish