How many Mappers will run?
On a cluster running MapReduce v2 (MRv2) on YARN, a MapReduce job is given a directory of 10
plain text files as its input directory. Each file is made up of 3 HDFS blocks. How many Mappers will
run?
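As a rough check on the arithmetic, the sketch below counts input splits under the assumption that the job uses the default text input format, that each split corresponds to exactly one HDFS block, and that one Mapper runs per split. The function name estimate_mappers is hypothetical and used only for illustration.

```python
def estimate_mappers(num_files: int, blocks_per_file: int) -> int:
    """Estimate the number of map tasks, assuming one input split per HDFS block."""
    # Assumption: split size equals block size and the plain text files are splittable.
    return num_files * blocks_per_file


if __name__ == "__main__":
    # 10 files, each made up of 3 HDFS blocks
    print(estimate_mappers(num_files=10, blocks_per_file=3))  # prints 30
```

A custom split size or a non-splittable compression codec would change this count, so the result holds only under the stated assumptions.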
Which ecosystem project should you use to perform these actions?
You are working on a project where you need to chain together MapReduce and Pig jobs. You also
need the ability to use forks, decision points, and path joins. Which ecosystem project should you use to
perform these actions?
Identify two features/issues that YARN is designed to address:
Identify two features/issues that YARN is designed to address:
What must you do if you are running a Hadoop cluster with a single NameNode and six DataNodes…
What must you do if you are running a Hadoop cluster with a single NameNode and six
DataNodes, and you want to change a configuration parameter so that it affects all six DataNodes?
How does this alter HDFS block storage?
A slave node in your cluster has four 2TB hard drives installed (4 x 2TB). The DataNode is
configured to store HDFS blocks on the disks. You set the value of the dfs.datanode.du.reserved
parameter to 100GB. How does this alter HDFS block storage?
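The Hadoop default configuration describes dfs.datanode.du.reserved as reserved space per volume, so the reservation applies to each disk separately. The sketch below works through the numbers for this node, assuming decimal units (1 TB = 1000 GB) and ignoring filesystem overhead; the function name usable_hdfs_bytes is hypothetical.

```python
GB = 10**9   # assumption: decimal units, chosen for simplicity
TB = 10**12


def usable_hdfs_bytes(num_volumes: int, volume_bytes: int, reserved_per_volume: int) -> int:
    """Upper bound on the space left for HDFS blocks across all volumes."""
    # The reservation is subtracted from every volume, not once per node.
    return num_volumes * max(volume_bytes - reserved_per_volume, 0)


if __name__ == "__main__":
    usable = usable_hdfs_bytes(num_volumes=4, volume_bytes=2 * TB, reserved_per_volume=100 * GB)
    print(usable / TB)  # prints 7.6
```

Under these assumptions, 400 GB in total (100 GB on each of the four disks) is kept free for non-HDFS use.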
Which configuration should you set?
Your cluster is running MapReduce version 2 (MRv2) on YARN. Your ResourceManager is
configured to use the FairScheduler. Now you want to configure your scheduler such that a new
user on the cluster can submit jobs into their own queue at application submission. Which
configuration should you set?
What is the cause of the error?
A user comes to you, complaining that when she attempts to submit a Hadoop job, it fails. There is
a directory in HDFS named /data/input. The JAR is named j.jar, and the driver class is named
DriverClass. She runs the command:
hadoop jar j.jar DriverClass /data/input /data/output
The error message returned includes the line:
PrivilegedActionException as:training (auth:SIMPLE)
cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/data/input
What is the cause of the error?
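The prefix on the path reported in the error is a URI scheme, and it is a useful clue when reading this kind of message. The sketch below uses Python's standard urllib.parse module to extract the scheme from the path in the error and from an HDFS URI for comparison. The host namenode.example.com and port 8020 are made-up values, and filesystem_scheme is a hypothetical helper.

```python
from urllib.parse import urlparse


def filesystem_scheme(path: str) -> str:
    """Return the URI scheme of a fully qualified Hadoop path."""
    return urlparse(path).scheme


if __name__ == "__main__":
    # Path as reported in the error message
    print(filesystem_scheme("file:/data/input"))
    # Hypothetical fully qualified HDFS path, for comparison
    print(filesystem_scheme("hdfs://namenode.example.com:8020/data/input"))
```

A file scheme suggests the input path was resolved against the local filesystem rather than HDFS, which is typically what happens when the client's Hadoop configuration does not point at the cluster.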
Which file contains a serialized form of all the directory and file inodes in the filesystem, giving the Name…
In CDH4 and later, which file contains a serialized form of all the directory and file inodes in the
filesystem, giving the NameNode a persistent checkpoint of the filesystem metadata?
Which process instantiates user code and executes map and reduce tasks on a cluster running MapReduce v2 (MRv…
Which process instantiates user code and executes map and reduce tasks on a cluster running
MapReduce v2 (MRv2) on YARN?
What should you do?
You are migrating a cluster from MapReduce version 1 (MRv1) to MapReduce version 2 (MRv2) on
YARN. You want to maintain your MRv1 TaskTracker slot capacities when you migrate. What
should you do?
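YARN schedules memory and CPU rather than fixed map and reduce slots, so one way to reason about this migration is to convert slot counts into equivalent resources. The sketch below is a minimal illustration, assuming each former slot maps to one container of a fixed memory size and one virtual core. The slot counts, container sizes, and the function name nodemanager_capacity are made-up values for illustration; the results would inform NodeManager settings such as yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores.

```python
from typing import Tuple


def nodemanager_capacity(map_slots: int, reduce_slots: int,
                         map_mb: int, reduce_mb: int) -> Tuple[int, int]:
    """Translate MRv1 slot counts into a (memory_mb, vcores) pair for one node."""
    # Assumption: every former slot becomes one container of the given size.
    memory_mb = map_slots * map_mb + reduce_slots * reduce_mb
    # Assumption: one virtual core per former slot.
    vcores = map_slots + reduce_slots
    return memory_mb, vcores


if __name__ == "__main__":
    # Hypothetical node: 8 map slots at 1024 MB, 4 reduce slots at 2048 MB
    print(nodemanager_capacity(8, 4, 1024, 2048))  # prints (16384, 12)
```

Real clusters also need headroom for the operating system and the Hadoop daemons, which this sketch does not model.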