PrepAway - Latest Free Exam Questions & Answers

What does this tell you?

Cluster Summary
45 files and directories, 12 blocks = 57 total. Heap Size is 15.31 MB / 193.38MB(7%)

Refer to the above screenshot.
You configure a Hadoop cluster with seven DataNodes, and the NameNode’s web UI displays
the details shown in the exhibit.
What does this tell you?

A.
The HDFS cluster is in safe mode.

B.
Your cluster has lost all HDFS data which had blocks stored on the dead DataNode.

C.
One physical host crashed.

D.
The DataNode JVM on one host is not active.

Explanation:
The data from the dead node is being replicated. The cluster is in safemode.
Note:
* Safemode
During startup, the NameNode loads the filesystem state from the fsimage and the edits log file.
It then waits for DataNodes to report their blocks so that it does not prematurely start
replicating blocks when enough replicas already exist in the cluster. During this time the
NameNode stays in safemode. Safemode is essentially a read-only mode for the HDFS cluster, in
which the NameNode does not allow any modifications to the filesystem or blocks. Normally the
NameNode leaves safemode automatically at the end of this startup phase. If required, HDFS can be
placed in safemode explicitly using the ‘bin/hadoop dfsadmin -safemode’ command. The NameNode
front page shows whether safemode is on or off. A more detailed description and configuration is
maintained as JavaDoc for setSafeMode().
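The safemode behaviour described above can be illustrated with a small toy model. This is only a sketch and not Hadoop's implementation: the class ToyNameNode, its method names, and the 0.999 threshold are assumptions made for the example. It shows a NameNode that starts read-only, leaves safemode once enough expected blocks have been reported, and can be put back into safemode explicitly.

```python
class ToyNameNode:
    """Toy model of safemode (hypothetical; not the Hadoop implementation)."""

    def __init__(self, expected_blocks, threshold=0.999):
        self.expected_blocks = set(expected_blocks)
        self.reported_blocks = set()
        self.threshold = threshold  # assumed fraction of blocks that must be reported
        self.safemode = True        # the NameNode starts up in safemode
        self.manual = False         # True when an operator entered safemode explicitly
        self.files = set()

    def _reported_fraction(self):
        if not self.expected_blocks:
            return 1.0
        return len(self.reported_blocks) / len(self.expected_blocks)

    def receive_block_report(self, block_ids):
        """Record blocks reported by a DataNode; leave safemode when enough are known."""
        self.reported_blocks |= set(block_ids) & self.expected_blocks
        if not self.manual and self._reported_fraction() >= self.threshold:
            self.safemode = False

    def set_safemode(self, on):
        """Explicitly enter or leave safemode, as an administrator would."""
        self.manual = on
        self.safemode = on

    def create_file(self, path):
        """Any modification is rejected while the cluster is read-only."""
        if self.safemode:
            raise PermissionError("NameNode is in safemode (read-only)")
        self.files.add(path)


# 12 blocks, as in the exhibit's Cluster Summary.
nn = ToyNameNode(expected_blocks=range(12))
nn.receive_block_report(range(0, 6))    # half the blocks known: still in safemode
nn.receive_block_report(range(6, 12))   # all blocks known: safemode is left
nn.create_file("/user/example.txt")     # modifications are now allowed
```

In this model a manually entered safemode is not lifted by later block reports; it stays on until set_safemode(False) is called.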
* Data Disk Failure, Heartbeats and Re-Replication
Each DataNode sends a Heartbeat message to the NameNode periodically. A network partition
can cause a subset of DataNodes to lose connectivity with the NameNode. The NameNode
detects this condition by the absence of a Heartbeat message. The NameNode marks DataNodes
without recent Heartbeats as dead and does not forward any new IO requests to them. Any data
that was registered to a dead DataNode is not available to HDFS any more. DataNode death may
cause the replication factor of some blocks to fall below their specified value. The NameNode
constantly tracks which blocks need to be replicated and initiates replication whenever necessary.
The necessity for re-replication may arise due to many reasons: a DataNode may become
unavailable, a replica may become corrupted, a hard disk on a DataNode may fail, or the
replication factor of a file may be increased.
* The NameNode periodically receives a Heartbeat and a Blockreport from each of the DataNodes in
the cluster. Receipt of a Heartbeat implies that the DataNode is functioning properly. A Blockreport
contains a list of all blocks on a DataNode. When the NameNode notices that it has not received a
Heartbeat message from a DataNode after a certain amount of time, the DataNode is marked as
dead. Since its blocks will be under-replicated, the system begins replicating the blocks that were
stored on the dead DataNode. The NameNode orchestrates the replication of data blocks from one
DataNode to another. The replication data transfer happens directly between DataNodes, and the
data never passes through the NameNode.
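The heartbeat and re-replication logic in the notes above can be sketched roughly as follows. This is a simplified illustration, not Hadoop code: the function names, the node names dn1 to dn7, the block IDs, the 630-second timeout, and the replication factor of 3 are all assumptions chosen for the example. A node is treated as dead when its last Heartbeat is older than the timeout, and each under-replicated block is copied from a surviving replica holder directly to another live node.

```python
def find_dead_nodes(last_heartbeat, now, timeout):
    """Return the nodes whose last heartbeat is older than `timeout` seconds."""
    return {node for node, seen in last_heartbeat.items() if now - seen > timeout}


def plan_rereplication(block_locations, all_nodes, dead, replication=3):
    """Return {block: (source_node, [target_nodes])} for under-replicated blocks.

    The plan only names a source and targets; in HDFS the copy itself would
    run between DataNodes and would not pass through the NameNode.
    """
    live = sorted(set(all_nodes) - set(dead))
    plan = {}
    for block, holders in sorted(block_locations.items()):
        live_holders = sorted(set(holders) - set(dead))
        missing = replication - len(live_holders)
        if missing <= 0 or not live_holders:
            continue  # fully replicated, or no live replica left to copy from
        candidates = [n for n in live if n not in live_holders]
        plan[block] = (live_holders[0], candidates[:missing])
    return plan


# Seven DataNodes, as in the question; dn7 stopped sending heartbeats long ago.
nodes = [f"dn{i}" for i in range(1, 8)]
last_heartbeat = {n: 1000 for n in nodes}
last_heartbeat["dn7"] = 100

dead = find_dead_nodes(last_heartbeat, now=1003, timeout=630)

block_locations = {
    "blk_1": ["dn1", "dn2", "dn7"],  # lost one replica with dn7
    "blk_2": ["dn3", "dn4", "dn5"],  # unaffected
}
plan = plan_rereplication(block_locations, nodes, dead)
```

Here only blk_1 ends up in the plan, because two live replicas remain and a third has to be created on another live node. No data is lost as long as at least one replica of each block survives.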
Incorrect answers:
B: The data is not lost; it is being replicated.
24 Interview Questions & Answers for Hadoop MapReduce developers, How
NameNode Handles data node failures?
