Which workloads benefit the most from faster network fabric?
You are planning a Hadoop cluster and considering implementing 10 Gigabit Ethernet as
the network fabric. Which workloads benefit the most from faster network fabric?
Which configuration should you set?
Your cluster is running MapReduce version 2 (MRv2) on YARN. Your ResourceManager is
configured to use the FairScheduler. Now you want to configure your scheduler such that a
new user on the cluster can submit jobs into their own queue at application submission. Which
configuration should you set?
How does this alter HDFS block storage?
A slave node in your cluster has four 2 TB hard drives installed (4 x 2 TB). The DataNode is
configured to store HDFS blocks on all disks. You set the value of the
dfs.datanode.du.reserved parameter to 100 GB. How does this alter HDFS block storage?
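The arithmetic behind this scenario can be sketched as follows. This is a minimal illustration, assuming that dfs.datanode.du.reserved is applied per volume (as its description in hdfs-default.xml states) and using decimal units; the function name is hypothetical and file system overhead is ignored.

```python
GB = 10**9
TB = 10**12

def hdfs_usable_bytes(num_volumes, volume_bytes, reserved_per_volume):
    """Space left for HDFS blocks when the reservation applies to each volume."""
    usable_per_volume = max(volume_bytes - reserved_per_volume, 0)
    return num_volumes * usable_per_volume

# Four 2 TB disks, 100 GB reserved on each of them (assumption: per-volume setting).
total_reserved = 4 * 100 * GB
usable = hdfs_usable_bytes(4, 2 * TB, 100 * GB)

print(total_reserved // GB, "GB reserved for non-HDFS use")
print(usable / TB, "TB available for HDFS blocks")
```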
What do you have to do on the cluster to allow the worker node to join, and start storing HDFS blocks?
You have installed a cluster running HDFS and MapReduce version 2 (MRv2) on YARN. You have
no dfs.hosts entry(ies) in your hdfs-site.xml configuration file. You configure a new worker
node by setting fs.default.name in its configuration files to point to the NameNode on your
cluster, and you start the DataNode daemon on that worker node. What do you have to do
on the cluster to allow the worker node to join, and start storing HDFS blocks?
What two steps must you take to change a configuration parameter on all six DataNodes?
You are running a Hadoop cluster with a single NameNode and six DataNodes, and you want
to change a configuration parameter so that it affects all six DataNodes. What two steps
must you take?
How will the cluster handle the replication of the file in this situation?
You use the hadoop fs -put command to add a file "sales.txt" to HDFS. This file is small
enough that it fits into a single block, which is replicated to three nodes in your cluster (with
a replication factor of 3). One of the nodes holding this file (a single block) fails. How will the
cluster handle the replication of the file in this situation?
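As a rough illustration of the mechanism this question is about: when a DataNode stops sending heartbeats, the NameNode treats its replicas as lost and schedules copies of any under-replicated block from a surviving replica to another live node. The toy model below is only a sketch; the function and node names are hypothetical, and real block placement also weighs rack awareness and node load.

```python
def rereplicate(block_locations, live_nodes, replication_factor):
    """Return replica locations after dropping dead nodes and copying to live ones."""
    surviving = [n for n in block_locations if n in live_nodes]
    if not surviving:
        raise RuntimeError("no live replica left to copy from")
    # Any live node that does not already hold the block is a candidate target.
    candidates = [n for n in sorted(live_nodes) if n not in surviving]
    missing = max(replication_factor - len(surviving), 0)
    return surviving + candidates[:missing]

# node1 has failed; the block for sales.txt was on node1, node2 and node3.
live = {"node2", "node3", "node4", "node5"}
new_locations = rereplicate(["node1", "node2", "node3"], live, 3)
print(new_locations)
```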
What command do you enter?
Given: You want to clean up this list by removing jobs where the State is KILLED. What
command do you enter?
What happens when you issue the third command?
Assume you have a file named foo.txt in your local directory. You issue the following three
commands:
    hadoop fs -mkdir input
    hadoop fs -put foo.txt input/foo.txt
    hadoop fs -put foo.txt input
What happens when you issue the third command?
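The path resolution at play here can be modelled with a small sketch. It assumes the usual behaviour of hadoop fs -put without the -f flag: a destination that is an existing directory receives the file under its own name, and an existing destination file is not overwritten. The put function and the dictionary-based namespace are hypothetical stand-ins, not Hadoop APIs, and the model ignores home-directory resolution of relative paths.

```python
import posixpath

def put(namespace, local_name, dest):
    """Toy model of `hadoop fs -put` without -f; namespace maps path -> 'dir' or 'file'."""
    if namespace.get(dest) == "dir":
        # Copying into a directory keeps the local file name.
        dest = posixpath.join(dest, posixpath.basename(local_name))
    if dest in namespace:
        raise FileExistsError(dest)
    namespace[dest] = "file"
    return dest

ns = {"input": "dir"}                  # after: hadoop fs -mkdir input
put(ns, "foo.txt", "input/foo.txt")    # second command creates input/foo.txt
try:
    put(ns, "foo.txt", "input")        # third command resolves to input/foo.txt
    third_failed = False
except FileExistsError as err:
    third_failed = True
    print("put failed, destination already exists:", err)
```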
How must you format the underlying file system of each DataNode?
You are configuring a Linux server running HDFS and MapReduce version 2 (MRv2) on YARN.
How must you format the underlying file system of each DataNode?
What should you do?
You are migrating a cluster from MapReduce version 1 (MRv1) to MapReduce version 2
(MRv2) on YARN. You want to maintain your MRv1 TaskTracker slot capacities when you
migrate. What should you do?