Syllabus of the Completed Hadoop Training Is Below

Module 1 : Introduction to BigData, Hadoop (HDFS and MapReduce) : Available (Length 35 Minutes)

1. BigData Introduction
2. Hadoop Introduction
3. HDFS Introduction
4. MapReduce Introduction

Video URL : http://www.youtube.com/watch?v=R-qjyEn3bjs (View Demo)

Module 2 : Deep Dive in HDFS : Available (Length 48 Minutes) + Useful for CCA175

1. HDFS Design
2. Fundamentals of HDFS (Blocks, NameNode, DataNode, Secondary NameNode)
3. Rack Awareness
4. Read/Write from HDFS

5. HDFS Federation and High Availability (Hadoop 2.x.x)
6. Parallel Copying using DistCp
7. HDFS Command Line Interface

Video URL : http://www.youtube.com/watch?v=PK6Im7tBWow (View Demo)

Module 2A : HDFS File Operation Lifecycle (Supplementary) : Available (Length 45 Minutes)

1. File Read Cycle from HDFS
- DistributedFileSystem
- FSDataInputStream
2. Failure or Error Handling When File Reading Fails
3. File Write Cycle to HDFS
- FSDataOutputStream
4. Failure or Error Handling When File Writing Fails

Video URL : http://www.youtube.com/watch?v=Wu2EGfQY-i4 (View Demo)

Module 3 : Understanding MapReduce : Available (Length 60 Minutes)
1. JobTracker and TaskTracker
2. Topology of a Hadoop Cluster
3. Example of MapReduce
- Map Function
- Reduce Function
4. Java Implementation of MapReduce
5. DataFlow of MapReduce
6. Use of Combiner
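
As a rough illustration of the map function, reduce function, and combiner topics in this module, here is a minimal in-memory word count sketch in plain Python. It does not use Hadoop or its Java API. The function names (`map_fn`, `combine`, `reduce_fn`, `run_job`) and the sample input splits are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from itertools import groupby


def map_fn(line):
    """Map: emit (word, 1) for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def combine(pairs):
    """Combiner: pre-sum counts on the map side to shrink shuffle traffic."""
    partial = defaultdict(int)
    for word, count in pairs:
        partial[word] += count
    return list(partial.items())


def reduce_fn(word, counts):
    """Reduce: sum all counts that arrived for one word."""
    return word, sum(counts)


def run_job(splits):
    """Run map + combine per split, then sort by key and reduce per key."""
    shuffled = []
    for split in splits:
        mapped = [pair for line in split for pair in map_fn(line)]
        shuffled.extend(combine(mapped))
    # Stand-in for the shuffle/sort phase: group records by key.
    shuffled.sort(key=lambda kv: kv[0])
    return dict(
        reduce_fn(word, (count for _, count in group))
        for word, group in groupby(shuffled, key=lambda kv: kv[0])
    )


# Two hypothetical input splits, each a list of text lines.
splits = [["big data big cluster"], ["data node", "big data"]]
result = run_job(splits)
print(result)
```

Because addition is associative and commutative, running the combiner on each split does not change the final counts. It only reduces the number of records passed to the reduce step.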

Module 4 : MapReduce Internals -1 (In Detail) : Available (Length 57 Minutes)

1. How MapReduce Works
2. Anatomy of MapReduce Job (MR-1)
3. Submission & Initialization of MapReduce Job (What Happens?)
4. Assigning & Execution of Tasks
5. Monitoring & Progress of MapReduce Job
6. Completion of Job
7. Failure Handling of MapReduce Job
- Task Failure
- TaskTracker Failure
- JobTracker Failure

Module 5 : MapReduce-2 (YARN : Yet Another Resource Negotiator Hadoop 2.x.x ) : Available (Length 52 Minutes)

1. Limitations of Current Architecture (Classic)
2. What are the Requirements?
3. YARN Architecture
4. JobSubmission and Job Initialization
5. Task Assignment and Task Execution
6. Progress and Monitoring of the Job
7. Failure Handling in YARN
- Task Failure
- Application Master Failure
- Node Manager Failure
- Resource Manager Failure

Module 6 : Advanced Topic for MapReduce (Performance and Optimization) : Available (Length 58 Minutes)

1. Job Scheduling
2. In Depth Shuffle and Sorting
3. Speculative Execution
4. Output Committers
5. JVM Reuse in MR1
6. Configuration and Performance Tuning

Module 7 : Advanced MapReduce Algorithm : Available (Length 87 Minutes)

File Based Data Structure
- Sequence File
- MapFile
Default Sorting In MapReduce
- Data Filtering (Map-only jobs)
- Partial Sorting
Data Lookup Strategies
- In MapFiles
Sorting Algorithm
- Total Sort (Globally Sorted Data)
- InputSampler
- Secondary Sort

Module 8 : Advanced MapReduce Algorithm -2 : Available : Private (Length 67 Minutes)

1. MapReduce Joining
- Reduce Side Join
- MapSide Join
- Semi Join
2. MapReduce Job Chaining
- MapReduce Sequence Chaining
- MapReduce Complex Chaining

Module 9 : Features of MapReduce : Available : Private (Length 61 Minutes)

Introduction to MapReduce Counters
Types of Counters
- Task Counters
- Job Counters
- User Defined Counters
Propagation of Counters
Side Data Distribution
- Using JobConfiguration
- Distributed Cache
Steps to Read and Delete Cache File

Module 10 : MapReduce DataTypes and Formats : Available : Private (Length 77 Minutes)
1. Serialization in Hadoop
2. Hadoop Writable and Comparable
3. Hadoop RawComparator and Custom Writable
4. MapReduce Types and Formats
5. Understand Difference Between Block and InputSplit
6. Role of RecordReader
7. FileInputFormat
8. CombineFileInputFormat and Processing a Whole File in a Single Mapper
9. Each input File as a record
10. Text/KeyValue/NLine InputFormat
11. BinaryInput processing
12. MultipleInputs Format
13. DatabaseInput and Output
14. Text/Binary/Multiple/Lazy OutputFormat MapReduce Types

Module 11 : Apache Pig : Available (Length 52 Minutes)

1. What is Pig ?
2. Introduction to Pig Data Flow Engine
3. Pig and MapReduce in Detail
4. When Should Pig Be Used?
5. Pig and Hadoop Cluster
6. Pig Interpreter and MapReduce
7. Pig Relations and Data Types
8. PigLatin Example in Detail
9. Debugging and Generating Example in Apache Pig

Module 11A : Hands On : Apache Pig Coding : Available (Length 23 Minutes)
1. Working with Grunt shell
2. Create word count application
3. Execute word count application
4. Accessing HDFS from grunt shell

Module 11B : Hands On : Apache Pig Complex Datatypes : Available (Length 14 Minutes)
1. Understand Map, Tuple and Bag
2. Create Outer Bag and Inner Bag
3. Defining Pig Schema

Module 11C : Hands On : Apache Pig Data loading : Available (Length 14 Minutes)
1. Understand Load statement
2. Loading csv file
3. Loading csv file with schema
4. Loading Tab separated file
5. Storing data back to HDFS

Module 11D : Hands On : Apache Pig Statements : Available (Length 8 Minutes)
1. ForEach statement
2. Example 1 : Data projecting and foreach statement
3. Example 2 : Projection using schema
4. Example 3 : Another way of selecting columns using two dots ..

Module 11E : Hands On : Apache Pig Complex Datatype practice : Available (Length 16 Minutes)
1. Example 1 : Loading Complex Datatypes
2. Example 2 : Loading compressed files
3. Example 3 : Store relation as compressed files
4. Example 4 : Nested FOREACH statements to solve the same problem

Module 12 : Fundamentals of Apache Hive Part-1 : Available (Length 60 Minutes) + Useful for CCA175

1. What is Hive ?
2. Architecture of Hive
3. Hive Services
4. Hive Clients
5. How Hive Differs from Traditional RDBMS
6. Introduction to HiveQL
7. Data Types and File Formats in Hive
8. File Encoding
9. Common problems while working with Hive

Module 13 : Apache Hive : Available (Length 73 Minutes ) + Useful for CCA175
1. HiveQL
2. Managed and External Tables
3. Understand Storage Formats
4. Querying Data
- Sorting and Aggregation
- MapReduce In Query
- Joins, SubQueries and Views
5. Writing User Defined Functions (UDFs)
6. Data Types and Schemas
7. HiveODBC

Module 14 : Understanding NGram Algorithm : Available (Length 14 Minutes) : Newly Replaced

Module 15 : Hands On : Step-by-Step Process for Creating and Configuring Eclipse for Writing MapReduce Code : Available (Length 29 Minutes) : Newly Replaced

Module 16 : Hands On : Analyzing the Result by Running NGram Application (UniGram, BiGram, TriGram etc.) : Available (Length 19 Minutes) : Newly Replaced
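
For readers new to the UniGram, BiGram, and TriGram terms used above, here is a small plain-Python sketch of n-gram counting. It is not the course's MapReduce application. The helper names (`ngrams`, `ngram_counts`) and the sample sentence are assumptions made for illustration, and tokenization is simple whitespace splitting.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, n):
    """Count n-grams in whitespace-tokenized, lower-cased text."""
    return Counter(ngrams(text.lower().split(), n))


# Hypothetical sample input.
text = "to be or not to be"
unigrams = ngram_counts(text, 1)   # single words
bigrams = ngram_counts(text, 2)    # pairs of adjacent words
trigrams = ngram_counts(text, 3)   # triples of adjacent words

print(bigrams.most_common(1))
```

In a MapReduce version of the same idea, the mapper would typically emit each n-gram with a count of 1 and the reducer would sum the counts per n-gram.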

Module 17 : NoSQL Introduction and Implementation : Available (Length 56 Minutes) New

1. What is NoSQL ?
2. NoSQL Characteristics or Common Traits
3. Categories of NoSQL DataBases
- Key-Value Database
- Document DataBase
- Column Family DataBase
- Graph DataBase
4. Aggregate Orientation : Perfect fit for NoSQL
5. NoSQL Implementation
6. Key-Value Database Example and Use
7. Document DataBase Example and Use
8. Column Family DataBase Example and Use
9. What is Polyglot persistence ?

Module 18 : HBase Introduction : Available (Part-1 Length 48 Minutes and Part-2 Length 37 Minutes) New

1. Fundamentals of HBase
2. Usage Scenario of HBase
3. Use of HBase in Search Engine
4. HBase DataModel
- Table and Row
- Column Family and Column Qualifier
- Cell and its Versioning
- Regions and Region Server
5. HBase Designing Tables
6. HBase Data Coordinates
7. Versions and HBase Operation
- Get/Scan
- Put
- Delete

Module 19 : Hands On : Creating MapReduce Application and Deploying on Hadoop Cluster : Available (Length 33 Minutes) : Newly Replaced
1. Creating MapReduce Program
2. Running MapReduce Job
3. Analyzing Resource Manager and looking for the logs

Module 20 : Apache Cassandra : Available (Length 63 Minutes) New

1. BigData and Apache Cassandra
2. Why Cassandra is so Popular
3. Cassandra as a Distributed DataBase
4. Cassandra and High Availability
5. Cassandra and Replication Mechanism
6. Cassandra's Elastic Scalability
7. Tunable Consistency
- Strict Consistency
- Causal Consistency
- Weak Consistency
8. Brewer's CAP Theorem
9. Cassandra as a Schema-Free DataBase
10. Where Should We Use Cassandra
11. Who Is Using Cassandra and Why

Module 21: Hands On MRUnit (MapReduce Testing Framework) : Available (Length 48 Minutes) New

1. Practice Basic MapReduce Without Installing Hadoop Framework
2. Mapper Testing
3. Reducer Testing
4. Counter Testing
5. Full MapReduce Job Testing

Module 22 : Apache Sqoop (SQL To Hadoop) : Available (Length 66 Minutes) New + Useful for CCA175

1. Sqoop Tutorial
2. How does Sqoop Work
3. Sqoop JDBCDriver and Connectors
4. Sqoop Importing Data
5. Various Options to Import Data
- Table Import
- Binary Data Import
- SpeedUp the Import
- Filtering Import
- Full DataBase Import Introduction to Sqoop

Module 23 : Apache Flume : Available (Length 28 Minutes) New

1. Data Acquisition : Apache Flume Introduction
2. Apache Flume Components
3. POSIX and HDFS File Write
4. Flume Events
5. Interceptors, Channel Selectors, Sink Processor

Module 24 : Advanced Apache Flume : Available (Length 48 Minutes) New

1. Sample Twitter Feed Configuration
2. Flume Channel
- Memory Channel
- File Channel
3. Sinks and Sink Processors
4. Sources
5. Channel Selectors
6. Interceptors

Module 25 : YARN Introduction (Length 52 Mins) Available Hadoop 2.x. YARN Training
1. Why Think Beyond MapReduce
2. New Components of YARN
3. Revisit Hadoop 1.0
4. How YARN fits in Hadoop Framework
5. Hadoop MR1 Components Revisit
6. Need for Non-MapReduce
7. YARN Components Introduction

Module 26 : Fundamental Overview of YARN (Length 40 Mins) Available Hadoop 2.x. YARN Training
1. YARN Functional Component
2. YARN Architecture Overview
3. Claiming and Re-claiming Resources
4. Functional Properties of
- Resource Manager
- Node Manager
- Application Master
5. YARN Scheduling Component
6. Introduction to FIFO Scheduler
7. Introduction to Capacity Scheduler

Module 27 : Powerful Hadoop 2.0 Framework (Length : 27 Mins) Available Hadoop 2.x. YARN Training
1. HDFS 1.0 Versus Hadoop 2.0
2. Resource Manager - Subcomponent
3. Details About Fair Share Scheduler
4. Hierarchical Queues in Scheduler
5. Containers
6. Node Manager and Its Responsibility
7. Role of Application Master while submitting Jobs

Module 28 : Submitting the Application to YARN Hadoop Cluster (Length : 27 Mins) Available Hadoop 2.x. YARN Training
1. Submitting the Application to YARN Hadoop Cluster
2. Managing Application Dependencies
3. Writing a YARN Application : Bird's-Eye View

Module 29 : LocalResources of the Application Available Hadoop 2.x. YARN Training
1. Understanding of YARN Application/Jobs Dependencies
2. Types of LocalResource
3. Visibilities of Local Resources
4. Lifetime of Local Resources
5. Good and Bad Local Resources
6. Target Directories of Local Resources

Module 30 : Deep Dive in Capacity Scheduler (Length 39 Mins) Available Hadoop 2.x. YARN Training
1. Introduction and Enabling Capacity Scheduler
2. Setting Up Queues in the Capacity Scheduler
3. Access Control List Setup
4. Managing Cluster Capacity with Queues
5. Resource Distribution Workflow Example

Module 31 : Managing Capacity Scheduler (Length 39 Mins) Available Hadoop 2.x. YARN Training
1. Managing Capacity with Queues
2. Resource Distribution Example
3. Understanding User Limits
4. Application Reservation
5. Understanding the Preemption

Module 32 : Hadoop Security : Kerberos Authentication (Length 23 Mins) Available Hadoop Security Training
1. Kerberos Authentication
2. Important Entities of Kerberos Authorization
3. How the Kerberos Process Works

Module 33 : Apache Spark : Introduction to Apache Spark (Length 48 Mins) Available Up to 100 Times Faster Data Processing + Useful for CCA175
1. Introduction to Apache Spark
2. Features of Apache Spark
3. Apache Spark Stack
4. Introduction to RDDs
5. RDD Transformations
6. What is Good and Bad in MapReduce
7. Why Use Apache Spark

Module 34 : Cloudera QuickStart VM Step By Step Installation (Length 19 Mins) Available + Steps in PDF + Hands On Lab
1. It Includes Hadoop 2.0
2. YARN
3. Hive
4. Pig
5. Hue
6. Apache Spark
7. Workflow

Module 35 : Load data in HDFS using the HDFS commands (Length 35 Mins) Available + Steps in PDF + Hands On Lab + Useful for CCA175

Module 36 : Importing Data from RDBMS to HDFS (Length 21 Mins) Available + Steps in PDF + Hands On Lab + Useful for CCA175
1. Without Specifying Directory
2. With target Directory
3. With warehouse directory

Module 37 : Sqoop Import Module (Length 41 Mins) Available + Steps in PDF + Hands On Lab + Useful for CCA175
1. Importing Subset of data from RDBMS
2. Changing the delimiter during Import
3. Encoding Null values
4. Importing Entire schema or all tables

Module 38 : Importing data to Hive Using Sqoop (Length 41 Mins) Available + Steps in PDF + Hands On Lab + Useful for CCA175

Module 39 : Apache Avro Introduction (Length 26 Mins) Available + PDF Download + Useful for CCA175
1. Why Avro files
2. Avro file Serialization and Deserialization
3. Adding fields
4. Deleting fields

Module 40 : Apache Avro Schema In Depth (Length 12 Mins) Available + PDF Download + Useful for CCA175
1. Avro schema example
2. Avro embedded schema
3. Avro schema primitive data types
4. Avro schema Complex data types
- Record, Map, Array, Union, Enum, Fixed, etc.

Module 41 : Apache Avro Schema Evolution (Length 16 Mins) Available + PDF Download + Useful for CCA175
1. Understand Avro Schema Evolution
2. Reader Schema and Writer Schema
3. JSON schema Adding new fields
4. JSON schema removing a field


All 41 modules above are available and ready to watch and learn (to buy, go to the top of the page).

If you want access to only this particular certification preparation material, then subscribe below.

Another good annual package is popular with users who want to learn more technologies, including Spark, Hadoop, Cassandra, Scala and much more. The annual subscription below includes any two certification preparation materials.



Hadoop Annual Subscription



Looking for NoSQL Cassandra certification preparation material? Check the available option below.





       

      Recommended Package for  Certification with the Training








      Click to View What Learners Say about us : Testimonials

      We have training subscribers from TCS, IBM, Infosys, Accenture, Apple, Hewitt, Oracle, NetApp, Capgemini, etc.


      One of the testimonials from a training subscriber :

      I really enjoy all the training you provide, so do you have any training on Data Science? I searched in the website could not find one, I would be happy to join if you send me the link.

      Thanks,
      A**tha

      Repeat Customer email :
      Hi

      I have gone through Apache scala and spark training videos. The concepts explained very well in depth. I would like to know following details 
      1. I am interested for on Training module of Pig and Hive. While checking  found that "Hadoop Professional Training" covers pig and hive modules but not found separately.  Can I get pig and Hive module access only ? or I need to go for complete "Hadoop Professional Training" ?
      2. In addition to that, I need inputs from you. I need to go for Cloudera certificate but while checking found CCD410 "Hadoop Developer" is obsolete so if I go for "MapR Hadoop Developer Certification", what is market value? is it good to go for this exam? then interested for "MapR Hadoop Developer Certification"  Simulator  also
      I would like to know the cost for above 1 + 2.

      Thanks
      Vip*l P*tel

      Read all testimonials, the voice of our learners :
      Testimonials