Your Hadoop cluster has 25 nodes with a total of 100 TB (4 TB per node) of raw disk space
allocated to HDFS storage. Assuming Hadoop's default configuration, how much data will you be
able to store?
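As a sketch of the arithmetic: under Hadoop's default configuration every HDFS block is replicated three times (dfs.replication = 3), so usable capacity is roughly the raw capacity divided by three. This ignores non-HDFS overhead such as space reserved for intermediate map output, so treat it as an upper bound:

```python
# Sketch: usable HDFS capacity under the default replication factor of 3.
# The node count and per-node disk come from the question; the replication
# factor is Hadoop's default (dfs.replication = 3).
raw_tb = 25 * 4          # 25 nodes x 4 TB each = 100 TB raw
replication = 3          # Hadoop default
usable_tb = raw_tb / replication
print(f"Usable capacity: ~{usable_tb:.1f} TB")  # Usable capacity: ~33.3 TB
```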
You observe that the number of spilled records from map tasks far exceeds the number of map
output records. Your child heap size is 1 GB and your io.sort.mb value is set to 100 MB. How would
you tune your io.sort.mb value to achieve maximum memory-to-disk I/O ratio?
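For context: map output spills to disk whenever the in-memory sort buffer fills, so spilled records exceeding map output records means records are being written (and re-read for merging) more than once, and the buffer is too small for the map output. A minimal sketch of raising the buffer in mapred-site.xml; the 256 MB value is an illustrative assumption only, chosen to fit comfortably within the 1 GB child heap:

```xml
<!-- mapred-site.xml: sketch only; the value below is an assumption
     for illustration, not a recommendation for every workload -->
<property>
  <name>io.sort.mb</name>
  <value>256</value> <!-- map-side sort buffer in MB, up from 100 -->
</property>
```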
The most important consideration for slave nodes in a Hadoop cluster running production jobs that
require short turnaround times is:
Identify two features/issues that MapReduce v2 (MRv2/YARN) is designed to address:
Choose three reasons why you should run the HDFS balancer periodically:
The failure of which daemon makes HDFS unavailable on a cluster running MapReduce v1
(MRv1)?
Where does a MapReduce job store the intermediate data output from Mappers?
In a cluster configured with HDFS High Availability (HA) but NOT HDFS federation, each map task
runs:
What additional capability does Ganglia provide for monitoring a Hadoop cluster?
Your existing Hadoop cluster has 30 slave nodes, each of which has 4 x 2 TB hard drives. You plan
to add another 10 nodes. How much disk space can your new nodes contain?
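As a sketch of the existing numbers (the per-node and cluster totals below follow directly from the question; whether the new nodes must match this per-node capacity is what the question is probing):

```python
# Sketch: raw capacity of the existing slave nodes from the question.
drives_per_node = 4
tb_per_drive = 2
per_node_tb = drives_per_node * tb_per_drive   # 8 TB per existing node
cluster_raw_tb = 30 * per_node_tb              # 240 TB across 30 nodes
print(per_node_tb, cluster_raw_tb)             # 8 240
```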