Apache Hadoop Introduction
- Big Data Introduction
- Why Hadoop?
- Fundamental Concepts
- Core Hadoop Components
- HDFS
- MapReduce
- Hive
- Pig
- Sqoop
- Flume
- HBase
- Spark, etc.

Hadoop Cluster Installation
- Rationale for a Cluster Management Solution
- Cloudera Manager Features
- Cloudera Manager Installation
- Hadoop (CDH) Installation
- The Hadoop Distributed File System (HDFS)

The Hadoop Distributed File System (HDFS)
- HDFS Features
- Writing and Reading Files
- NameNode Memory Considerations
- Overview of HDFS Security
- Web UIs for HDFS
- Using the Hadoop File Shell

MapReduce and Spark on YARN
- The Role of Computational Frameworks
- YARN: The Cluster Resource Manager
- MapReduce Concepts
- Apache Spark Concepts
- Running Computational Frameworks on YARN
- Exploring YARN Applications Through the Web UIs and the Shell
- YARN Application Logs

Hadoop Configuration and Daemon Logs
- Cloudera Manager Constructs for Managing Configurations
- Locating Configurations and Applying Configuration Changes
- Managing Role Instances and Adding Services
- Configuring the HDFS Service
- Configuring Hadoop Daemon Logs
- Configuring the YARN Service

Getting Data Into HDFS
- Ingesting Data From External Sources With Flume
- Ingesting Data From Relational Databases With Sqoop
- REST Interfaces
- Best Practices for Importing Data

Planning Your Hadoop Cluster
- General Planning Considerations
- Choosing the Right Hardware
- Virtualization Options
- Network Considerations
- Configuring Nodes

Installing and Configuring Hive, Impala, Spark, and Pig
- Hive
- Impala
- Spark
- Pig

Hadoop Clients Including Hue
- What Are Hadoop Clients?
- Installing and Configuring Hadoop Clients
- Installing and Configuring Hue
- Hue Authentication and Authorization

Advanced Cluster Configuration
- Advanced Configuration Parameters
- Configuring Hadoop Ports
- Configuring HDFS for Rack Awareness
- Configuring HDFS High Availability

Hadoop Security
- Why Hadoop Security Is Important
- Hadoop's Security System Concepts
- What Kerberos Is and How It Works
- Securing a Hadoop Cluster With Kerberos
- Other Security Concepts

Managing Resources
- Configuring cgroups with Static Service Pools
- The Fair Scheduler
- Configuring Dynamic Resource Pools
- YARN Memory and CPU Settings
- Impala Query Scheduling

Cluster Maintenance
- Checking HDFS Status
- Copying Data Between Clusters
- Adding and Removing Cluster Nodes
- Rebalancing the Cluster
- Directory Snapshots
- Cluster Upgrading

Cluster Monitoring and Troubleshooting
- Cloudera Manager Monitoring Features
- Monitoring Hadoop Clusters
- Troubleshooting Hadoop Clusters
- Common Misconfigurations

Course Deliverables
- Workshop-style coaching
- Interactive approach
- Course material
- Proof-of-concept (POC) implementation
- Hands-on practice exercises for each topic
- Quiz at the end of each major topic
- Tips and techniques for the Cloudera certification examination