QUALITY THOUGHT Hadoop Course Content
QUALITY THOUGHT * www.facebook.com/qthought * www.qualitythought.in PH NO: 9963799240, 040-40025423 1 Email Id: [email protected]
START DATE : TIMINGS : DURATION : TYPE OF BATCH : FEE : FACULTY NAME : LAB TIMINGS :
We are ready to serve Latest Testing Trends, Are you ready to learn??
New Batches Info
Introduction to Hadoop/Big Data:
Hadoop is an open-source software framework used for distributed storage and processing of big data sets using the MapReduce programming model. It runs on computer clusters built from commodity hardware. All the modules in Hadoop are designed with the fundamental assumption that hardware failures are common occurrences and should be handled automatically by the framework.
The core of Apache Hadoop consists of a storage part, known as the Hadoop Distributed File System (HDFS), and a processing part, which is the MapReduce programming model. Hadoop splits files into large blocks and distributes them across nodes in a cluster. It then transfers packaged code to the nodes to process the data in parallel. This approach takes advantage of data locality, where each node works on the data it holds locally. This allows the dataset to be processed faster and more efficiently than it would be in a more conventional supercomputer architecture that relies on a parallel file system, where computation and data are distributed via high-speed networking.
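The block splitting and distribution described above can be sketched in a few lines of Python. This is only an illustration, not HDFS code: the function names `split_into_blocks` and `place_replicas` are hypothetical, the 128 MB block size and replication factor of 3 are assumed values that a cluster can configure differently, and the round-robin placement is a simplification standing in for the real HDFS placement policy.

```python
MB = 1024 * 1024


def split_into_blocks(file_size, block_size=128 * MB):
    """Return (offset, length) pairs that cover a file of file_size bytes."""
    blocks = []
    offset = 0
    while offset < file_size:
        # The last block may be shorter than block_size.
        length = min(block_size, file_size - offset)
        blocks.append((offset, length))
        offset += length
    return blocks


def place_replicas(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Hypothetical placement for illustration only; it ignores racks,
    free space, and node health.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


if __name__ == "__main__":
    blocks = split_into_blocks(300 * MB)
    placement = place_replicas(len(blocks), ["node1", "node2", "node3", "node4"])
    for block_id, (offset, length) in enumerate(blocks):
        print(block_id, offset // MB, length // MB, placement[block_id])
```

Under these assumptions, a 300 MB file becomes two full 128 MB blocks plus a final 44 MB block, and each block is held by three of the four nodes, so losing one node leaves every block still readable.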
The base Apache Hadoop framework is composed of the following modules:
Hadoop Common – contains libraries and utilities needed by other Hadoop modules;
Hadoop Distributed File System (HDFS) – a distributed file-system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster;
Hadoop YARN – a platform responsible for managing computing resources in clusters and using them for scheduling users' applications; and
Hadoop MapReduce – an implementation of the MapReduce programming model for large-scale data processing
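The MapReduce programming model named in the last module can be illustrated with a word count, the customary first example. The sketch below runs in a single Python process and only mimics the three phases (map, shuffle, reduce); it does not use the Hadoop API, and all function names are chosen for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield (word, 1)


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return (key, sum(values))


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    print(word_count(["big data", "Big cluster"]))
```

On a real cluster the mapper calls would run in parallel on the nodes that hold the input blocks, and the shuffle would move data across the network; here everything happens in memory to keep the phases visible.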
Offered Tools in Hadoop:
Covered Tools
1. Hadoop Architecture 2. HDFS 3. MapReduce 4. Pig 5. Hive
6. Sqoop 7. NoSQL 8. HBase 9. Oozie 10. YARN 11. ZooKeeper
1. Spark 2. Scala 3. Kafka 4. NoSQL - Cassandra 5. NiFi
6. IoT 7. Flink
Understanding Big Data and Hadoop
What is Big Data
3 V's Concepts
Different Problems and Solutions of Big Data
Hadoop Architecture
What is Big Data
What is Hadoop and History of Hadoop
Hadoop Architecture
Hadoop ecosystem components
Hadoop Storage: HDFS
Hadoop Processing: MapReduce Framework
Hadoop Server Roles: NameNode, Secondary NameNode, and DataNode
Anatomy of File Write and Read.
Different Components of Hadoop.
HDFS (Hadoop Distributed File System)
Significance of HDFS in Hadoop
Features of HDFS
5 Daemons of Hadoop
NameNode and its functionality
DataNode and its functionality
JobTracker and its functionality
TaskTracker and its functionality
Secondary NameNode and its functionality
Data Storage in HDFS
Introduction about Blocks
Data Replication
Data storage in Data Nodes
Replication Configuration
Custom Replication
Fail Over Mechanism
Design Constraints
Replication Factor
Changing block size for file and Directory
Hadoop Cluster Configuration and Data Loading
Hadoop Cluster Architecture
Hadoop Cluster Configuration files
Hadoop Cluster Modes
MapReduce Job execution
Common Hadoop Shell commands
Hadoop Copy Commands
Introduction about Blocks
Data Replication
Hadoop MapReduce framework
MapReduce Architecture
Hadoop Data Types
Hadoop MapReduce paradigm
Mapper and Reducer tasks
MapReduce Execution Framework
Partitions and Combiners
Hands on MapReduce Programming.
Advanced MapReduce
MapReduce Programming Model
Different Phases of MapReduce Algorithm
How to write a basic MapReduce Program
The Driver Code
The Mapper
The Reducer
Joining Data Sets in MapReduce Jobs - Map Joins and Reduce Joins
Creating Input and Output formats in MapReduce Jobs
Text Input Format
Key Value Input Format
Sequence file Input Format
How to Debug MapReduce Jobs in Eclipse
Data Localization in MapReduce
Combiner (Mini Reducer) and Partitioner
Speculative execution on Mappers and Reducers
Distributed Cache
Counters, Custom Writable
Secondary Sorting Using MapReduce
Apache PIG and Pig Latin
Introduction to Apache PIG
MapReduce Vs. Apache PIG
SQL Vs. Apache PIG
Physical & Logical Layer
Different Data types in Apache PIG
Modes of Execution in Apache PIG
Local Mode, Map Reduce or Distributed Mode
Execution Mechanism
Grunt shell, Script, Embedded
Transformations in PIG
How to write a simple PIG Script
UDFs in PIG
Hands on with Pig Latin script
Hive and HiveQL
HIVE Introduction
Hive Architecture and Installation
Comparison with Traditional Database
Operators and Functions
Hive Metastore and Integration with MySQL
Hive integration with Hadoop
SQL vs. HIVE QL
Hive UDFs
Partitioning, Dynamic Partitioning and Bucketing
Hive SerDe (Serializer and Deserializer)
RegexSerDe (Regular Expressions)
Hive Tables (Managed Tables and External Tables, Storage Formats, Importing Data, Altering Tables, Dropping Tables)
Hive data formats – Text, ORC, Avro, Parquet
SQOOP
Introduction to SQOOP
How to connect to a relational database using SQOOP
Different Sqoop Commands
Different flavours of imports, exports, and Hive imports
Hands on with Examples
HBase and ZooKeeper
HBase introduction
HBase use cases
HBase basics
Column families, Scans
HBase architecture
ZooKeeper Service:
Data Model, Operations, Implementation, Consistency, Sessions, States
HBase Admin
Schema definition, Basic CRUD Operations
Flume
Flume Introduction
Flume Architecture
Flume Master, Flume collector and Flume Agent
Real time example with Twitter
Oozie
Oozie Introduction
Oozie Architecture
Oozie Configuration files
Oozie Job Submission
o workflow.xml
o Coordinator.xml
o Job.coordinator.properties
Hadoop 2.0, MRv2 and YARN
Hadoop 2.0 New Features:
NameNode High Availability
HDFS Federation
MRv2, YARN, Running MRv1 in YARN
Apache Spark with Scala
Introduction to Scala
Why Scala
Scala Vs Java
Scala Basics
Scala Data types
Scala Packages
Variable Declarations
Variable Type Inference
Control Structures
Interactive Scala – Scala Shell
Writing Scala Scripts – Compiling the Scala Programs
Defining Functions in Scala
Different IDEs for Scala
SPARK
Introduction to Spark
Motivation for Spark
Spark Vs Map Reduce Processing
Architecture of Spark
Spark Shell Introduction
Creating Spark Context
File Operations in Spark Shell
Caching in Spark
Real time Examples of Spark
Spark Components
o Spark Core
o Spark SQL
o Spark Streaming
o Spark MLlib
Features of RDD
o Lazily Evaluated
o Immutable
o Partitioned
RDD operations
o Actions
o Transformation in RDD
KAFKA:
Introduction Apache Kafka
What is Kafka?
Need for Kafka
Core Concepts of Kafka
Kafka Architecture
Where is Kafka Used?
Deep Dive into Kafka Cluster
Understanding the components of Kafka Cluster, Installation of Kafka Cluster, Configuring Kafka Cluster, Producer of Kafka, Consumer of Kafka, Producer and Consumer in Action.
Kafka Operations and Performance Tuning
Offset Design Hardware, Kafka Monitoring and Issues
Kafka Performance Tuning
Reading data from Kafka
Demo - Twitter Kafka Producer
Kafka with Spark
Ecosystem of Spark
Understanding the Spark Cluster
Integrating Kafka with Spark
NoSQL Cassandra:
Introduction of BigData and NoSQL
What is Big Data
What is SQL
What is NoSQL
Brewer’s CAP Theorem
Introducing Cassandra
Distributed and Decentralized
Cassandra Architecture
o Ring distributed architecture
o Peer-to-Peer
o Gossip protocol
o Failure Detection
Elastic Scalability
High Availability and Fault Tolerance
Tunable Consistency
Column-Oriented
Schema-Free
High Performance
Installing Cassandra
Installation of DataStax Cassandra
Installation of Dev Center
CQL – Cassandra Query Language
Keyspaces
CQL Tables
Partition Keys / Primary Key
Clustering Keys
Composite Keys
Secondary Indexes
Materialized Views
Java with Cassandra
Installing Oracle JDK 1.7
DataStax Java driver API
Cassandra & Java Datatypes (Blob)
Sample Java application with Cassandra Driver
Hands on with the Java connector for Cassandra
Java Web Application with Cassandra database
Table operations from Java application
o Create Keyspace
o Create Table, index, etc.
o Insert, Update, and Alter using Collection data types
Sample Java Web application with Cassandra on App server
Advanced topics of Cassandra
Eventual Consistency
CQL Batch
Security
Durability
Eventual Consistency & Tunable Consistency
Multi DataCenters
Alter & Drop Table
Commit Log
Memtable
SSTable
How writes work
How reads work
Scalability
Partitioning
Replication
Compaction
o Size tiered
o Leveled
CQL Handling Blobs
IoT (Internet of Things):
1. IoT-Introduction
Introducing IoT
Elements of IoT
Real World IoT Applications
2. IoT-Architecture
Elements of IoT Architecture
Sensors
Actuators
Gateway
IoT Platforms and Analytics
4. Communication Protocols
Wide area communication protocols:
Cellular, Sigfox, satellite
Communication protocols:
MQTT, CoAP, XMPP
By Sathish