Which method should you tell that developer to implement?

seenagape

11 years ago

You have converted your Hadoop cluster from a MapReduce 1 (MRv1) architecture to a
MapReduce 2 (MRv2) on YARN architecture. Your developers are accustomed to specifying the
number of map and reduce tasks (resource allocation) when they run jobs. A developer wants to
know how to specify the number of reduce tasks when a specific job runs. Which method should
you tell that developer to implement?

A.
Developers specify reduce tasks in the exact same way for both MapReduce version 1 (MRv1)
and MapReduce version 2 (MRv2) on YARN. Thus, executing -D mapreduce.job.reduces=2 will
specify 2 reduce tasks.

B.
In YARN, the ApplicationMaster is responsible for requesting the resources required for a
specific job. Thus, executing -D yarn.applicationmaster.reduce.tasks=2 will specify that the
ApplicationMaster launch two task containers on the worker nodes.

C.
In YARN, resource allocation is a function of megabytes of memory in multiples of 1024 MB.
Thus, they should specify the amount of memory resource they need by executing -D
mapreduce.reduce.memory.mb=2048

D.
In YARN, resource allocation is a function of virtual cores specified by the ApplicationMaster
making requests to the NodeManager, where a reduce task is handled by a single container (and
thus a single virtual core). Thus, the developer needs to specify the number of virtual cores to the
NodeManager by executing -D yarn.nodemanager.cpu-vcores=2

E.
MapReduce version 2 (MRv2) on YARN abstracts resource allocation away from the idea of
“tasks” into memory and virtual cores, thus eliminating the need for a developer to specify the
number of reduce tasks, and indeed preventing the developer from specifying the number of
reduce tasks.
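
For readers unfamiliar with the -D key=value form that several options rely on, here is a minimal sketch of how such a pair could end up in a job's configuration. It is plain Java with no Hadoop dependency: the class ReduceTaskOptions and its helper methods are hypothetical stand-ins written for illustration, not Hadoop's own GenericOptionsParser. The only Hadoop-specific facts assumed are the property name mapreduce.job.reduces and its default value of 1.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ReduceTaskOptions {
    // Hadoop property that holds the number of reduce tasks for a job.
    static final String REDUCES_KEY = "mapreduce.job.reduces";

    // Hypothetical, simplified stand-in for generic option handling:
    // collects "-D key=value" and "-Dkey=value" pairs into a map and
    // ignores every other argument (such as input and output paths).
    static Map<String, String> parseGenericOptions(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            String pair = null;
            if (args[i].equals("-D") && i + 1 < args.length) {
                pair = args[++i];               // form: -D key=value
            } else if (args[i].startsWith("-D") && args[i].length() > 2) {
                pair = args[i].substring(2);    // form: -Dkey=value
            }
            if (pair != null) {
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    conf.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
            }
        }
        return conf;
    }

    // Reads the reduce task count, falling back to 1 when the property is unset.
    static int reduceTasks(Map<String, String> conf) {
        return Integer.parseInt(conf.getOrDefault(REDUCES_KEY, "1"));
    }

    public static void main(String[] args) {
        String[] cli = {"-D", "mapreduce.job.reduces=2", "input", "output"};
        System.out.println(reduceTasks(parseGenericOptions(cli)));
    }
}
```

In a real driver the same setting can also be made programmatically with Job.setNumReduceTasks(2). Command-line -D options are only picked up when the driver is launched through ToolRunner (or otherwise uses GenericOptionsParser).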