You have a 20-node Hadoop cluster, with 18 slave nodes and 2 master nodes running HDFS High
Availability (HA). You want to minimize the chance of data loss in your cluster. What should you
do?

A.
Add another master node to increase the number of nodes running the JournalNode which
increases the number of machines available to HA to create a quorum
B.
Set an HDFS replication factor that provides data redundancy, protecting against node failure
C.
Run a Secondary NameNode on a different master from the NameNode in order to provide
automatic recovery from a NameNode failure.
D.
Run the ResourceManager on a different master from the NameNode in order to load-share
HDFS metadata processing
E.
Configure the cluster’s disk drives with an appropriate fault tolerant RAID level
B & D
Answer “B”
I don’t think D adds fault tolerance. It just reduces the load on a master node, which is not really necessary in such a small cluster.
Having more than 2 JournalNodes, however, adds more fault tolerance for the NameNode metadata, which is why A should be correct.
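As a sketch of what option A describes: in QJM-based HA, both NameNodes write their edit log to a quorum of JournalNodes listed in dfs.namenode.shared.edits.dir, and an odd number of JournalNodes (at least three) lets the quorum survive a failure. The hostnames (jn1–jn3.example.com) and the nameservice ID "mycluster" below are hypothetical placeholders, not values from the question.

```xml
<!-- hdfs-site.xml: minimal QJM sketch. Hostnames and nameservice ID are
     hypothetical. Three JournalNodes tolerate the loss of one, since a
     majority (2 of 3) is still available to form a quorum. -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster</value>
</property>
```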
I agree that A is the correct answer: “Add another master node to increase the number of nodes running the JournalNode which increases the number of machines available to HA to create a quorum”. However, it shouldn’t say “another master node” — what is a master node? In HDFS we only have NameNodes and DataNodes. If “master node” were changed to “JournalNode”, it would be a perfect answer.
https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
I think B is the only answer that makes any sense. Read the question carefully: you can have HA without setting a proper data replication factor, and data replication is directly related to potential data loss. The ResourceManager only relates to YARN functionality and is needed regardless of HA.
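For context on option B, the cluster-wide default replication factor is set in hdfs-site.xml; the value 3 below is HDFS’s default (each block is stored on three DataNodes), so the data survives the loss of two nodes.

```xml
<!-- hdfs-site.xml: dfs.replication controls how many DataNodes hold a
     copy of each block. 3 is the HDFS default. -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
```

Replication for files that already exist can be changed with `hdfs dfs -setrep -w 3 /some/path` (the path here is a hypothetical example).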
B is correct.
I think E is the best answer. Even with NameNode HA configured, you still have to worry about the risk of losing data; the only way to avoid that is to use RAID.
Why can’t it be C?
The NN and SNN should not be on the same master; the SNN should run on a different master.
From the Hadoop docs: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
“Note that, in an HA cluster, the Standby NameNode also performs checkpoints of the namespace state, and thus it is not necessary to run a Secondary NameNode, CheckpointNode, or BackupNode in an HA cluster. In fact, to do so would be an error. This also allows one who is reconfiguring a non-HA-enabled HDFS cluster to be HA-enabled to reuse the hardware which they had previously dedicated to the Secondary NameNode.”
As per the Hadoop documentation, a maximum of two NameNodes can be configured.
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html
dfs.ha.namenodes.[nameservice ID] – unique identifiers for each NameNode in the nameservice
Configure with a list of comma-separated NameNode IDs. This will be used by DataNodes to determine all the NameNodes in the cluster. For example, if you used “mycluster” as the nameservice ID previously, and you wanted to use “nn1” and “nn2” as the individual IDs of the NameNodes, you would configure this as such:
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
Note: Currently, only a maximum of two NameNodes may be configured per nameservice.
Hence “A” is not a valid option.
I have the same idea: D.
E is the best answer.