You have a 20-node Hadoop cluster, with 18 slave nodes and 2 master nodes running HDFS High
Availability (HA). You want to minimize the chance of data loss in your cluster. What should you
do?

A.
Add another master node to increase the number of nodes running the JournalNode which
increases the number of machines available to HA to create a quorum
B.
Configure the cluster’s disk drives with an appropriate fault tolerant RAID level
C.
Run the ResourceManager on a different master from the NameNode in order to load-share
HDFS metadata processing
D.
Run a Secondary NameNode on a different master from the NameNode in order to provide
automatic recovery from a NameNode failure
E.
Set an HDFS replication factor that provides data redundancy, protecting against failure
A
E
C
E is the correct answer.
E is the correct answer.
Not A, since JournalNodes do not need to run on the NameNode hosts.
Not C: although C is a good idea for performance, it does not directly affect the chance of data loss.
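To illustrate why E is the option that targets data loss, here is a rough sketch in Python. It assumes each replica of a block sits on a different node and that nodes fail independently with the same probability before HDFS can re-replicate. The function name and the 1% failure figure are made up for illustration and do not come from the question. In a real cluster the replication factor is set with the `dfs.replication` property, which defaults to 3.

```python
def block_loss_probability(node_failure_prob: float, replication: int) -> float:
    """Chance that every replica of one block is lost.

    Assumption (illustrative only): replicas live on distinct nodes that
    fail independently with the same probability.
    """
    if not 0.0 <= node_failure_prob <= 1.0:
        raise ValueError("node_failure_prob must be between 0 and 1")
    if replication < 1:
        raise ValueError("replication must be at least 1")
    # All replicas must fail for the block to be lost.
    return node_failure_prob ** replication


if __name__ == "__main__":
    # Hypothetical per-node failure probability, chosen only for the example.
    p = 0.01
    for factor in (1, 2, 3):
        print(f"replication={factor}: loss probability={block_loss_probability(p, factor):.0e}")
```

Under these assumptions each extra replica multiplies the loss probability by the per-node failure probability, which is why raising the replication factor protects data more directly than moving master services around.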