PrepAway - Latest Free Exam Questions & Answers

In a cluster configured with HDFS High Availability (HA) but NOT HDFS federation, each map task runs:


A. In the same Java Virtual Machine as the DataNode.
B. In the same Java Virtual Machine as the TaskTracker.
C. In its own Java Virtual Machine.
D. In the same Java Virtual Machine as the JobTracker.

Correct answer: C

Explanation:
A TaskTracker is a slave-node daemon that accepts tasks (Map, Reduce, and Shuffle operations) from the JobTracker. Only one TaskTracker process runs on any Hadoop slave node, and it runs in its own JVM process. Every TaskTracker is configured with a set of slots, which indicate the number of tasks it can accept. The TaskTracker starts a separate JVM process for each task it runs (called a Task Instance); this ensures that a task-process failure does not take down the TaskTracker itself. The TaskTracker monitors these task instances, capturing their output and exit codes. When a task instance finishes, successfully or not, the TaskTracker notifies the JobTracker. TaskTrackers also send periodic heartbeat messages to the JobTracker (every few seconds by default) to reassure it that they are still alive. These messages also report the number of available slots, so the JobTracker can stay up to date with where in the cluster work can be delegated.
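As a sketch of how slots and per-task JVMs are controlled in practice, the Hadoop 1.x properties below set the number of map/reduce slots a TaskTracker advertises and the options passed to each separately spawned child JVM. The property names are the standard Hadoop 1.x ones; the values are illustrative only:

```xml
<!-- mapred-site.xml (Hadoop 1.x) -- illustrative values -->
<configuration>
  <!-- Map and reduce slots this TaskTracker advertises to the JobTracker -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
  <!-- JVM options for each child (Task Instance) JVM the TaskTracker spawns -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>
  </property>
</configuration>
```

Because each map task runs in one of these child JVMs rather than inside the TaskTracker, DataNode, or JobTracker process, option C is correct regardless of whether HA or federation is enabled.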
Note: Despite this very high level of reliability, HDFS has always had a well-known single point of failure which impacts HDFS's availability: the system relies on a single NameNode to coordinate access to the file system data. In clusters used exclusively for ETL or batch-processing workflows, a brief HDFS outage may not have immediate business impact on an organization; however, in recent years HDFS has begun to be used for more interactive workloads or, in the case of HBase, to directly serve customer requests in real time. In such cases, an HDFS outage will immediately impact the productivity of internal users, and may result in downtime visible to external users. For these reasons, adding high availability (HA) to the HDFS NameNode became one of the top priorities for the HDFS community.
Reference: 24 Interview Questions & Answers for Hadoop MapReduce Developers, "What is a TaskTracker in Hadoop? How many instances of TaskTracker run on a Hadoop cluster?"
