Question-46: How does the NameNode handle the failure of DataNodes?

Answer: HDFS has a master/slave architecture. An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. 

In addition, there are a number of DataNodes, usually one per node in the cluster, which manage storage attached to the nodes that they run on. 

The NameNode and DataNode are pieces of software designed to run on commodity machines. 

The NameNode periodically receives a Heartbeat and a Blockreport from each of the DataNodes in the cluster. Receipt of a Heartbeat implies that the DataNode is functioning properly, and a Blockreport contains a list of all blocks stored on that DataNode. 

When the NameNode stops receiving Heartbeat messages from a DataNode for a certain amount of time, it marks that DataNode as dead. The blocks that were stored on the dead DataNode are now under-replicated, so the NameNode begins replicating them to other DataNodes. The NameNode orchestrates this replication, but the block data is transferred directly between DataNodes and never passes through the NameNode. 
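
To make this concrete, below is a simplified, hypothetical Java sketch of the kind of bookkeeping involved. The class name and the 30-second timeout are illustrative only, not Hadoop's actual implementation; the real default dead-node timeout is much longer (roughly ten minutes).

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch of dead-DataNode detection; not Hadoop's real code.
    public class HeartbeatMonitor {

        // Illustrative timeout; real HDFS defaults to roughly ten minutes.
        private static final long TIMEOUT_MS = 30_000;

        // Last heartbeat time per DataNode, keyed by a made-up node id.
        private final Map<String, Long> lastHeartbeat = new HashMap<>();

        // Called whenever a DataNode's heartbeat arrives.
        public synchronized void onHeartbeat(String dataNodeId) {
            lastHeartbeat.put(dataNodeId, System.currentTimeMillis());
        }

        // Run periodically: mark silent DataNodes dead and trigger re-replication.
        public synchronized void checkLiveness() {
            long now = System.currentTimeMillis();
            for (Map.Entry<String, Long> entry : lastHeartbeat.entrySet()) {
                if (now - entry.getValue() > TIMEOUT_MS) {
                    // In real HDFS the NameNode would now schedule the dead node's
                    // blocks for re-replication, copied directly between DataNodes.
                    System.out.println(entry.getKey() + " marked dead; re-replicating its blocks");
                }
            }
        }
    }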

Question-47: Can Reducers talk with each other?

Answer: No. Each Reducer runs in isolation; Reducers cannot communicate with one another during a job. 

Question-48: Where is the Mapper's intermediate data stored?

Answer: The mapper output (intermediate data) is stored on the local file system (NOT HDFS) of each individual mapper node. This is typically a temporary directory location, which the Hadoop administrator can set in the configuration. The intermediate data is cleaned up after the Hadoop job completes. 
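
For reference, the configured location can be read from the job configuration. A minimal sketch, assuming the Hadoop 2.x property name mapreduce.cluster.local.dir (older releases used mapred.local.dir):

    import org.apache.hadoop.mapred.JobConf;

    public class LocalDirCheck {
        public static void main(String[] args) {
            // JobConf loads mapred-default.xml / mapred-site.xml from the classpath.
            JobConf conf = new JobConf();
            // Hadoop 2.x property name (mapred.local.dir in older releases);
            // by default it resolves to a path under hadoop.tmp.dir on local disk.
            String localDirs = conf.get("mapreduce.cluster.local.dir", "<not configured>");
            System.out.println("Intermediate map output directories: " + localDirs);
        }
    }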

Question-49: What is the use of Combiners in the Hadoop framework?

Answer: Combiners are used to increase the efficiency of a MapReduce program. They aggregate intermediate map output locally on each mapper node before it is sent over the network, so they can reduce the amount of data that has to be shuffled across to the reducers.

You can use your reducer code as a combiner if the operation performed is commutative and associative (for example, sum or max, but not average).

The execution of the combiner is not guaranteed: Hadoop may or may not execute it, and if required it may execute it more than once. Therefore, your MapReduce jobs should not depend on the combiner's execution. 
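
For example, the classic word count job can reuse its reducer as a combiner. A minimal sketch using the org.apache.hadoop.mapreduce API:

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Emits (word, 1) for every token in the input line.
        public static class TokenizerMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // Sums the counts for each word; usable as both combiner and reducer
        // because integer addition is commutative and associative.
        public static class IntSumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            private final IntWritable result = new IntWritable();

            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                result.set(sum);
                context.write(key, result);
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class); // local pre-aggregation; may run 0..n times
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }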

Question-50: What is the Hadoop MapReduce API contract for a key and value Class?

Answer: The Key must implement the org.apache.hadoop.io.WritableComparable interface.

The value must implement the org.apache.hadoop.io.Writable interface.
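
For illustration, a hypothetical custom key class that satisfies this contract is sketched below; a value-only class would implement just Writable's write and readFields methods:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import org.apache.hadoop.io.WritableComparable;

    // Hypothetical custom key type. Keys must be WritableComparable so the
    // framework can serialize them and sort them during the shuffle.
    public class UserIdKey implements WritableComparable<UserIdKey> {

        private long userId;

        public UserIdKey() { }                         // no-arg constructor required
        public UserIdKey(long userId) { this.userId = userId; }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeLong(userId);                     // serialization
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            userId = in.readLong();                    // deserialization
        }

        @Override
        public int compareTo(UserIdKey other) {        // defines the sort order
            return Long.compare(userId, other.userId);
        }

        @Override
        public int hashCode() {                        // used by the default HashPartitioner
            return Long.hashCode(userId);
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof UserIdKey && ((UserIdKey) obj).userId == userId;
        }
    }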