PrepAway - Latest Free Exam Questions & Answers

Which process describes the lifecycle of a Mapper?

A.
The JobTracker calls the TaskTracker’s configure() method, then its map() method, and finally
its close() method.

B.
The TaskTracker spawns a new Mapper to process all records in a single input split.

C.
The TaskTracker spawns a new Mapper to process each key-value pair.

D.
The JobTracker spawns a new Mapper to process all records in a single file.

Explanation:
The correct answer is B. For each map task that runs, the TaskTracker creates a new instance of
your mapper, and that instance processes every record in the task's input split.
Note:
* The Mapper processes the key/value pairs obtained from the InputFormat. It may perform any
number of extraction and transformation steps on each pair before emitting zero, one, or many
key/value pairs, whose types may be the same as or different from the input types.
* With the new Hadoop API, mappers extend the org.apache.hadoop.mapreduce.Mapper class.
By default this class defines an ‘Identity’ map function: every input key/value pair obtained from
the InputFormat is written out unchanged.
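To make the "zero, one, or many" point concrete, here is a minimal plain-Java sketch. It does not use Hadoop at all: `Collector`, `IdentityMapper`, and `WordMapper` are hypothetical stand-ins invented for this illustration, with `Collector.write` playing the role of `Context.write`.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; none of these types are Hadoop classes.
class MapOutputSketch {

    /** Hypothetical stand-in for Context: collects emitted pairs as "key=value" strings. */
    static final class Collector {
        final List<String> out = new ArrayList<>();

        void write(Object key, Object value) {
            out.add(key + "=" + value);
        }
    }

    /** Mimics the default behaviour: every input pair is written out unchanged. */
    static class IdentityMapper {
        void map(Object key, Object value, Collector collector) {
            collector.write(key, value);
        }
    }

    /** Overrides map() to emit one (word, 1) pair per word: zero, one, or many per input. */
    static class WordMapper extends IdentityMapper {
        @Override
        void map(Object key, Object value, Collector collector) {
            for (String word : value.toString().trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    collector.write(word, 1);
                }
            }
        }
    }

    /** Feeds each line to the mapper, using the line's character offset as the key. */
    static List<String> run(IdentityMapper mapper, String... lines) {
        Collector collector = new Collector();
        long offset = 0;
        for (String line : lines) {
            mapper.map(offset, line, collector);
            offset += line.length() + 1; // +1 for the assumed newline separator
        }
        return collector.out;
    }

    public static void main(String[] args) {
        System.out.println(run(new IdentityMapper(), "a b", ""));
        System.out.println(run(new WordMapper(), "a b", "", "c"));
    }
}
```

With the identity mapper, each input pair produces exactly one output pair. With the word mapper, the empty line produces nothing, while "a b" produces two pairs.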
Examining the run() method, we can see the lifecycle of the mapper:
    /**
     * Expert users can override this method for more complete control over the
     * execution of the Mapper.
     *
     * @param context
     * @throws IOException
     */
    public void run(Context context) throws IOException, InterruptedException {
        setup(context);
        while (context.nextKeyValue()) {
            map(context.getCurrentKey(), context.getCurrentValue(), context);
        }
        cleanup(context);
    }
setup(Context) – Performs any setup for the mapper. The default implementation is a no-op.
map(Key, Value, Context) – Performs the map operation on the given key/value pair. The default
implementation calls Context.write(Key, Value).
cleanup(Context) – Performs any cleanup for the mapper. The default implementation is a no-op.
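The call order can be checked with a small plain-Java simulation. This is a sketch under stated assumptions, not Hadoop code: `LoggingMapper` and `runJob` are hypothetical names, splits are modelled as in-memory lists, and they are processed one after another, whereas a real cluster runs each split as its own map task, usually in parallel.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Plain-Java simulation; these classes only mimic the shape of the Hadoop API.
class MapperLifecycleSketch {

    /** Hypothetical mapper that records every lifecycle call in a shared log. */
    static class LoggingMapper {
        private final int id;
        private final List<String> log;

        LoggingMapper(int id, List<String> log) {
            this.id = id;
            this.log = log;
        }

        void setup() {
            log.add("mapper" + id + ".setup");
        }

        void map(String record) {
            log.add("mapper" + id + ".map(" + record + ")");
        }

        void cleanup() {
            log.add("mapper" + id + ".cleanup");
        }

        /** Same shape as run(): setup once, map once per record, cleanup once. */
        void run(Iterator<String> records) {
            setup();
            while (records.hasNext()) {
                map(records.next());
            }
            cleanup();
        }
    }

    /** Creates one new mapper instance per split and runs it over that split's records. */
    static List<String> runJob(List<List<String>> splits) {
        List<String> log = new ArrayList<>();
        int id = 0;
        for (List<String> split : splits) {
            new LoggingMapper(id++, log).run(split.iterator());
        }
        return log;
    }

    public static void main(String[] args) {
        List<List<String>> splits = List.of(List.of("r1", "r2"), List.of("r3"));
        runJob(splits).forEach(System.out::println);
    }
}
```

The log shows one setup and one cleanup per split, with a map call for every record in between. This matches option B: a new mapper per input split, not per key-value pair.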
Reference: Hadoop/MapReduce/Mapper

10 Comments on “Which process describes the lifecycle of a Mapper?”

  1. satish says:

    The TaskTracker creates a new Mapper for each input split, and that mapper runs over all the key-value pairs (records) in that split.
    Answer is B.




