Your client application submits a MapReduce job to your Hadoop cluster. The Hadoop framework
looks for an available slot to schedule the MapReduce operations on which of the following
Hadoop computing daemons?
JobTracker is the daemon service for submitting and tracking MapReduce jobs in
Hadoop. Only one JobTracker process runs on any Hadoop cluster, and it runs in its own
JVM. In a typical production cluster it runs on a separate machine. Each slave
node is configured with the JobTracker node's location. The JobTracker is a single point of
failure for the Hadoop MapReduce service: if it goes down, all running jobs are halted. The
JobTracker performs the following actions (from the Hadoop Wiki):
Client applications submit jobs to the JobTracker.
The JobTracker talks to the NameNode to determine the location of the data.
The JobTracker locates TaskTracker nodes with available slots at or near the data.
The JobTracker submits the work to the chosen TaskTracker nodes.
The TaskTracker nodes are monitored. If they do not submit heartbeat signals often enough, they
are deemed to have failed and the work is scheduled on a different TaskTracker.
A TaskTracker will notify the JobTracker when a task fails. The JobTracker decides what to do
then: it may resubmit the job elsewhere, it may mark that specific record as something to avoid,
and it may even blacklist the TaskTracker as unreliable.
When the work is completed, the JobTracker updates its status.
Client applications can poll the JobTracker for information.
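
The slot selection and heartbeat steps above can be pictured with a small toy model. This is only a sketch under stated assumptions and not Hadoop's scheduler code: the TaskTracker class, the pick_tracker and live_trackers functions, the timeout value, and the failure threshold are all hypothetical names and numbers chosen for illustration. "Near the data" is reduced to node-local versus not node-local, with no rack awareness.

```python
from dataclasses import dataclass

# Illustrative timeout in seconds (an assumption, not read from any Hadoop config).
HEARTBEAT_TIMEOUT = 600.0


@dataclass
class TaskTracker:
    """Hypothetical record of what a JobTracker might know about one TaskTracker."""
    host: str
    free_slots: int
    last_heartbeat: float  # timestamp of the most recent heartbeat
    failures: int = 0      # task failures reported by this tracker


def live_trackers(trackers, now, timeout=HEARTBEAT_TIMEOUT):
    """Keep only trackers whose last heartbeat is recent enough."""
    return [t for t in trackers if now - t.last_heartbeat <= timeout]


def pick_tracker(trackers, block_hosts, now, max_failures=3):
    """Choose a live, non-blacklisted tracker with a free slot.

    Trackers on a host that stores the input block are preferred;
    ties are broken by the larger number of free slots.
    Returns None when no tracker qualifies.
    """
    candidates = [
        t for t in live_trackers(trackers, now)
        if t.free_slots > 0 and t.failures < max_failures
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda t: (t.host not in block_hosts, -t.free_slots))


trackers = [
    TaskTracker("node1", free_slots=0, last_heartbeat=100.0),  # holds data, but full
    TaskTracker("node2", free_slots=2, last_heartbeat=100.0),  # holds data, has slots
    TaskTracker("node3", free_slots=4, last_heartbeat=100.0),  # no local data
    TaskTracker("node4", free_slots=4, last_heartbeat=0.0),    # heartbeat too old
]
chosen = pick_tracker(trackers, block_hosts={"node1", "node2"}, now=650.0)
print(chosen.host)  # prints "node2"
```

In this model the slots belong to the TaskTracker records, while the selection logic plays the role of the JobTracker, which mirrors the division of work described in the steps above.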
Reference: 24 Interview Questions & Answers for Hadoop MapReduce developers, What is a
JobTracker in Hadoop? How many instances of JobTracker run on a Hadoop Cluster?