Cloudera Administrator Training for Apache Hadoop v2
DESCRIPTION
Training for Cloudera Hadoop Admin Training
TRANSCRIPT
-
Administrator Training for Apache Hadoop
Take your knowledge to the next level with Cloudera's Apache Hadoop Training and Certification
Cloudera University's three-day administrator training course for Apache Hadoop provides system administrators with a comprehensive understanding of all the steps necessary to operate and manage Hadoop clusters. From installation and configuration through load balancing and tuning your cluster, Cloudera's administration course has you covered.
Through lecture and interactive, hands-on exercises, attendees will cover topics such as:
> Introduction to Apache Hadoop and HDFS
> Apache Hadoop architecture
> Proper cluster configuration and deployment
> Populating HDFS using Apache Sqoop
> Management and monitoring tools
> Job scheduling
> Best practices for maintaining Apache Hadoop in production
> Installing and managing other Apache Hadoop projects
> Diagnosing, tuning and solving Apache Hadoop issues
Upon completion of the course, attendees receive a voucher for a Cloudera Certified Administrator for Apache Hadoop (CCAH) exam. Certification is a great differentiator; it helps establish individuals as leaders in their field, providing customers with tangible evidence of skills and expertise.
AUDIENCE
This course is designed for people with at least a basic level of Linux system administration experience. Prior knowledge of Hadoop is not required.
"Cloudera Administrator Training for Apache Hadoop helped me to advance my use of Apache Hadoop and cultivate a better understanding of the platform's inner workings. The course material, interactive labs and exercises really helped cement together all the little bits and pieces that I had bumped into prior to the class into a useful mental model of how Apache Hadoop works."
ERIC MARSHALL, SENIOR SYSTEM ADMINISTRATOR
TRAINING SHEET
-
Course Outline: Cloudera Administrator Training for Apache Hadoop
Introduction
The Case for Apache Hadoop
> A Brief History of Hadoop
> Core Hadoop Components
> Fundamental Concepts
The Hadoop Distributed File System
> HDFS Features
> HDFS Design Assumptions
> Overview of HDFS Architecture
> Writing and Reading Files
> NameNode Considerations
> An Overview of HDFS Security
> Hands-On Exercise
MapReduce
> What Is MapReduce?
> Features of MapReduce
> Basic MapReduce Concepts
> Architectural Overview
> MapReduce Version 2
> Failure Recovery
> Hands-On Exercise
An Overview of the Hadoop Ecosystem
> What Is the Hadoop Ecosystem?
> Integration Tools
> Analysis Tools
> Data Storage and Retrieval Tools
Planning Your Hadoop Cluster
> General Planning Considerations
> Choosing the Right Hardware
> Network Considerations
> Configuring Nodes
Hadoop Installation
> Deployment Types
> Installing Hadoop
> Using Cloudera Manager for Easy Installation
> Basic Configuration Parameters
> Hands-On Exercise
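To give a flavor of the basic configuration parameters covered above, here is a minimal sketch of the two central HDFS config files. The hostname and values are placeholder examples, not from the course material; the property names are the standard Hadoop 2.x names (Hadoop 1.x uses fs.default.name instead of fs.defaultFS).

```xml
<!-- core-site.xml: tells clients and daemons where the filesystem lives -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <!-- Example NameNode address; 8020 is the customary RPC port -->
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>

<!-- hdfs-site.xml: HDFS-specific settings -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <!-- Number of copies of each block; 3 is the usual default -->
    <value>3</value>
  </property>
</configuration>
</code>
```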
Advanced Configuration
> Advanced Parameters
> Configuring Rack Awareness
> Configuring Federation
> Configuring High Availability
> Using Configuration Management Tools
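As a sketch of what "Configuring Rack Awareness" involves: Hadoop invokes an administrator-supplied topology script (named by topology.script.file.name in Hadoop 1.x, net.topology.script.file.name in Hadoop 2.x) with node IPs or hostnames as arguments and reads one rack path per argument from stdout. The subnets and rack names below are invented for illustration.

```shell
#!/bin/sh
# Hypothetical rack topology script: maps each node address to a rack path.
rack_for() {
  case "$1" in
    10.1.1.*) echo "/dc1/rack1" ;;
    10.1.2.*) echo "/dc1/rack2" ;;
    *)        echo "/default-rack" ;;   # Hadoop's conventional fallback rack
  esac
}

# Hadoop may pass several addresses in one invocation; answer each in order.
for node in "$@"; do
  rack_for "$node"
done
```

With this in place, HDFS will spread block replicas across racks rather than risk losing all copies to a single rack-level failure.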
Hadoop Security
> Why Hadoop Security Is Important
> Hadoop's Security System Concepts
> What Kerberos Is and How It Works
> Configuring Kerberos Security
> Integrating a Secure Cluster with Other Systems
Managing and Scheduling Jobs
> Managing Running Jobs
> Hands-On Exercise
> The FIFO Scheduler
> The FairScheduler
> Configuring the FairScheduler
> Hands-On Exercise
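For a taste of "Configuring the FairScheduler": under MRv1 the scheduler is enabled in mapred-site.xml (mapred.jobtracker.taskScheduler set to org.apache.hadoop.mapred.FairScheduler) and reads an allocations file of pool definitions. The pool names and numbers below are a hypothetical sketch, not values from the course.

```xml
<?xml version="1.0"?>
<!-- Example FairScheduler allocations file (MRv1) -->
<allocations>
  <pool name="production">
    <minMaps>10</minMaps>        <!-- guaranteed map slots for this pool -->
    <minReduces>5</minReduces>   <!-- guaranteed reduce slots -->
    <weight>2.0</weight>         <!-- gets twice the fair share of a weight-1 pool -->
  </pool>
  <pool name="adhoc">
    <maxRunningJobs>3</maxRunningJobs>  <!-- cap concurrent jobs for ad-hoc users -->
  </pool>
  <userMaxJobsDefault>5</userMaxJobsDefault>
</allocations>
```

The practical effect is that a long batch job no longer starves short interactive jobs, as it would under the default FIFO scheduler.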
Cluster Maintenance
> Checking HDFS Status
> Hands-On Exercise
> Copying Data Between Clusters
> Adding and Removing Cluster Nodes
> Rebalancing the Cluster
> Hands-On Exercise
> NameNode Metadata Backup
> Cluster Upgrading
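The maintenance topics above map onto a handful of standard CLIs. The commands themselves are real Hadoop tools, but they need a live cluster, so this sketch only prints each command when its binary is not on the PATH; the NameNode URIs are placeholders.

```shell
#!/bin/sh
# Run a command if its binary exists; otherwise show what would be run.
maybe_run() {
  if command -v "${1%% *}" >/dev/null 2>&1; then
    eval "$1"
  else
    echo "would run: $1"
  fi
}

maybe_run "hdfs dfsadmin -report"        # DataNode liveness, capacity, remaining space
maybe_run "hdfs fsck / -files -blocks"   # namespace and block integrity check
maybe_run "hdfs balancer -threshold 10"  # even out DataNode usage to within 10% of the mean
maybe_run "hadoop distcp hdfs://nn-a:8020/data hdfs://nn-b:8020/data"  # copy between clusters
```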
Cluster Monitoring and Troubleshooting
> General System Monitoring
> Managing Hadoop's Log Files
> Using the NameNode and JobTracker Web UIs
> Hands-On Exercise
> Cluster Monitoring with Ganglia
> Common Troubleshooting Issues
> Benchmarking Your Cluster
Populating HDFS From External Sources
> An Overview of Flume
> Hands-On Exercise
> An Overview of Sqoop
> Best Practices for Importing Data
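A Sqoop import of the kind covered above boils down to one command: Sqoop reads a relational table over JDBC and writes it into HDFS in parallel map tasks. The JDBC URL, database, table, and target directory below are placeholders, not course values, and the command is only attempted if sqoop is installed.

```shell
#!/bin/sh
if command -v sqoop >/dev/null 2>&1; then
  # Import the "orders" table into HDFS using 4 parallel map tasks,
  # split on the table's primary key.
  sqoop import \
    --connect jdbc:mysql://dbhost/sales \
    --username reporting \
    --table orders \
    --target-dir /user/etl/orders \
    --num-mappers 4
  SQOOP_NOTE="ran sqoop import"
else
  SQOOP_NOTE="sqoop not on PATH; command shown for illustration only"
  echo "$SQOOP_NOTE"
fi
```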
Installing and Managing Other Hadoop Projects
> Hive
> Pig
> HBase
Conclusion
Appendix: Kerberos Configuration
© 2012 Cloudera, Inc. All rights reserved. Cloudera and the Cloudera logo are trademarks or registered trademarks of Cloudera Inc. in the USA and other countries. All other trademarks are the property of their respective companies. Information is subject to change without notice.
Cloudera, Inc. 220 Portage Avenue, Palo Alto, CA 94306 USA | 1-888-789-1488 or 1-650-362-0488 | cloudera.com
Cloudera Certified Administrator for Apache Hadoop (CCAH)
Establish yourself as a trusted and valuable resource by completing the certification exam for Apache Hadoop administrators. CCAH certifies the core system administrator skills sought by companies and organizations deploying Apache Hadoop. The exam can be demanding and will test your fluency with concepts and terminology in the following areas:
Hadoop Distributed File System (HDFS)
Recognize and identify daemons and understand the normal operation of an Apache Hadoop cluster, both in data storage and in data processing. Describe the current features of computing systems that motivate a system like Apache Hadoop:
> HDFS Design
> HDFS Daemons
> HDFS Federation
> HDFS HA
> Securing HDFS (Kerberos)
> File Read and Write Paths
MapReduce
Understand MapReduce core concepts and MapReduce v2 (MRv2 / YARN).

Apache Hadoop Cluster Planning
Discuss the principal points to consider in choosing the hardware and operating systems to host an Apache Hadoop cluster.

Apache Hadoop Cluster Installation and Administration
Analyze cluster handling of disk and machine failures. Recognize and identify regular tools for monitoring and managing HDFS.

Resource Management
Describe how the default FIFO scheduler and the FairScheduler handle the tasks in a mix of jobs running on a cluster.

Monitoring and Logging
Discuss the functions and features of Apache Hadoop's logging and monitoring systems.

Ecosystem
Understand ecosystem projects and what you need to do to deploy them on a cluster.