You have recently converted your Hadoop cluster from a MapReduce 1 (MRv1) architecture to a
MapReduce 2 (MRv2) on YARN architecture. Your developers are accustomed to specifying the number
of map and reduce tasks (resource allocation) when they run jobs. A developer wants to know how
to specify the number of reduce tasks when a specific job runs. Which method should you tell that
developer to implement?
A.
MapReduce version 2 (MRv2) on YARN abstracts resource allocation away from the idea of
“tasks” into memory and virtual cores, thus eliminating the need for a developer to specify the
number of reduce tasks, and indeed preventing the developer from specifying the number of
reduce tasks.
B.
In YARN, resource allocation is a function of megabytes of memory in multiples of 1024 MB.
Thus, they should specify the amount of memory resource they need by executing -D mapreducereduces.memory-mb-2048
C.
In YARN, the ApplicationMaster is responsible for requesting the resources required for a
specific launch. Thus, executing -D yarn.applicationmaster.reduce.tasks=2 will specify that the
ApplicationMaster launch two task containers on the worker nodes.
D.
Developers specify reduce tasks in exactly the same way for both MapReduce version 1 (MRv1)
and MapReduce version 2 (MRv2) on YARN. Thus, executing -D mapreduce.job.reduces=2 will
specify two reduce tasks.
E.
In YARN, resource allocation is a function of virtual cores specified by the ApplicationManager
making requests to the NodeManager, where a reduce task is handled by a single container (and
thus a single virtual core). Thus, the developer needs to specify the number of virtual cores to the
NodeManager by executing -p yarn.nodemanager.cpu-vcores=2
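
For reference, the `-D property=value` form that appears in several of the options is Hadoop's generic-option syntax, which the real `GenericOptionsParser` class handles when a job driver is run through `ToolRunner`. The sketch below is only a simplified Python stand-in for that behavior, not Hadoop code: the function name `parse_generic_options` and the sample arguments are hypothetical, and it assumes each property is passed either as `-D key=value` or `-Dkey=value`.

```python
def parse_generic_options(argv):
    """Split an argument list into -D properties and remaining job arguments.

    Hypothetical, simplified stand-in for Hadoop's generic-option handling.
    """
    props = {}
    rest = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "-D" and i + 1 < len(argv):
            # Form: -D key=value (two separate tokens)
            pair = argv[i + 1]
            i += 2
        elif arg.startswith("-D") and len(arg) > 2:
            # Form: -Dkey=value (one token)
            pair = arg[2:]
            i += 1
        else:
            # Anything else is passed through to the job untouched
            rest.append(arg)
            i += 1
            continue
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected property=value, got {pair!r}")
        props[key] = value
    return props, rest


if __name__ == "__main__":
    props, rest = parse_generic_options(
        ["-D", "mapreduce.job.reduces=2", "input", "output"]
    )
    print(props, rest)
    # prints: {'mapreduce.job.reduces': '2'} ['input', 'output']
```

A property given without an `=` separator is rejected by this sketch, which is a reminder that the key and its value must be joined by `=` for the setting to be read as a property at all.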