PrepAway - Latest Free Exam Questions & Answers

which two issues?

MapReduce v2 (MRv2/YARN) is designed to address which two issues?


A. Single point of failure in the NameNode.

B. Resource pressure on the JobTracker.

C. HDFS latency.

D. Ability to run frameworks other than MapReduce, such as MPI.

E. Reduce complexity of the MapReduce APIs.

F. Standardize on a single MapReduce API.

Explanation:
YARN (Yet Another Resource Negotiator), as a component of Hadoop, offers two major
kinds of benefits:
* (D) The ability to use programming frameworks other than MapReduce.
/ MPI (Message Passing Interface) is cited as a typical example of an alternative to
MapReduce.
* Scalability, regardless of which programming framework you use.

Note:
* The fundamental idea of MRv2 is to split the two major functions of the JobTracker,
resource management and job scheduling/monitoring, into separate daemons. The idea is to have
a global ResourceManager (RM) and a per-application ApplicationMaster (AM). An application is
either a single job in the classical MapReduce sense or a DAG of jobs.
* (B) The central goal of YARN is to cleanly separate two things that are tightly coupled
in current Hadoop, mainly in the JobTracker:
/ Monitoring the status of the cluster with respect to which nodes have which resources available.
Under YARN, this will be global.
/ Managing the parallel execution of any specific job. Under YARN, this will be done
separately for each job.
The current Hadoop MapReduce system is fairly scalable: Yahoo runs 5,000 Hadoop jobs, truly
concurrently, on a single cluster, for a total of 1.5 to 2 million jobs per cluster per month.
Still, YARN will remove scalability bottlenecks.
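
The split described in the note can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the YARN API: the class names mirror the daemon names, but the methods (`allocate`, `release`, `request`, `complete`), the notion of "slots", and the round-based loop in `run_cluster` are all hypothetical simplifications. It shows one global object tracking which nodes have free resources, and one object per application tracking that application's own tasks, regardless of whether the application is a MapReduce job or something else such as MPI.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Toy global tracker (hypothetical): free container slots per node."""
    free_slots: dict  # node name -> number of free slots

    def allocate(self, wanted: int) -> list:
        """Grant up to `wanted` containers; each is represented by its node name."""
        granted = []
        for node in sorted(self.free_slots):
            while self.free_slots[node] > 0 and len(granted) < wanted:
                self.free_slots[node] -= 1
                granted.append(node)
        return granted

    def release(self, nodes: list) -> None:
        """Return containers to the global pool."""
        for node in nodes:
            self.free_slots[node] += 1


@dataclass
class ApplicationMaster:
    """Toy per-application master (hypothetical): manages one job's tasks only."""
    name: str
    pending_tasks: int
    finished_tasks: int = 0
    held: list = field(default_factory=list)

    def request(self, rm: ResourceManager) -> None:
        # Ask the global manager for as many containers as there are pending tasks.
        self.held = rm.allocate(self.pending_tasks)

    def complete(self, rm: ResourceManager) -> int:
        # Pretend every held container ran one task, then hand the containers back.
        ran = len(self.held)
        self.pending_tasks -= ran
        self.finished_tasks += ran
        rm.release(self.held)
        self.held = []
        return ran


def run_cluster(rm: ResourceManager, masters: list) -> int:
    """Run rounds until every application is done; return the number of rounds."""
    rounds = 0
    while any(am.pending_tasks for am in masters):
        for am in masters:
            am.request(rm)
        progressed = sum(am.complete(rm) for am in masters)
        if progressed == 0:
            raise RuntimeError("cluster has no free capacity")
        rounds += 1
    return rounds


if __name__ == "__main__":
    rm = ResourceManager({"node1": 2, "node2": 2})
    apps = [ApplicationMaster("mapreduce-job", 3), ApplicationMaster("mpi-job", 3)]
    print("rounds needed:", run_cluster(rm, apps))
```

In this toy setup the two applications share four slots, so six tasks need two rounds. The point of the design is that the global object never needs to know what a task is; per-job bookkeeping lives entirely in each application's own master, which is the load the note says MRv2 moves out of the JobTracker.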
Reference: Apache Hadoop YARN – Concepts & Applications
