PrepAway - Latest Free Exam Questions & Answers

Identify what determines the data types used by the Mapper for a given job.

You are developing a MapReduce job for sales reporting. The mapper will process input keys
representing the year (IntWritable) and input values representing product identifiers (Text).
Identify what determines the data types used by the Mapper for a given job.

A.
The key and value types specified in the JobConf.setMapInputKeyClass and
JobConf.setMapInputValuesClass methods

B.
The data types specified in HADOOP_MAP_DATATYPES environment variable

C.
The mapper-specification.xml file submitted with the job determines the mapper’s input key and
value types.

D.
The InputFormat used by the job determines the mapper’s input key and value types.

Explanation:
The input types fed to the mapper are controlled by the InputFormat the job uses, so D is the
correct answer. The default input format, TextInputFormat, loads data as (LongWritable, Text)
pairs. The LongWritable key is the byte offset of the line in the file, and the Text value holds
the string contents of that line.
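The (byte offset, line) pairing described above can be illustrated with a small plain-Java
sketch. It does not use the Hadoop API: LineRecordSketch, LineRecord and readRecords are
hypothetical names invented for this illustration, and the sketch ignores details the real
TextInputFormat handles, such as input splits and other line terminators.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LineRecordSketch {

    // Hypothetical stand-in for one (LongWritable, Text) pair; not a Hadoop class.
    static final class LineRecord {
        final long byteOffset; // key: byte position where the line starts in the file
        final String line;     // value: the line's contents, without the newline

        LineRecord(long byteOffset, String line) {
            this.byteOffset = byteOffset;
            this.line = line;
        }
    }

    // Splits UTF-8 file contents on '\n' and pairs each line with its starting byte offset.
    static List<LineRecord> readRecords(byte[] fileBytes) {
        List<LineRecord> records = new ArrayList<>();
        int lineStart = 0;
        for (int i = 0; i <= fileBytes.length; i++) {
            boolean emit;
            if (i == fileBytes.length) {
                // End of file: emit a final line only if it has no trailing newline.
                emit = lineStart < i;
            } else {
                emit = fileBytes[i] == '\n';
            }
            if (emit) {
                String line = new String(fileBytes, lineStart, i - lineStart,
                        StandardCharsets.UTF_8);
                records.add(new LineRecord(lineStart, line));
                lineStart = i + 1; // the next line begins after the newline byte
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up sample input in the spirit of the sales-reporting question.
        byte[] data = "2011\twidget\n2012\tgadget\n".getBytes(StandardCharsets.UTF_8);
        for (LineRecord r : readRecords(data)) {
            System.out.println(r.byteOffset + " -> " + r.line);
        }
    }
}
```

The keys here are byte offsets rather than years, which is why a mapper that expects
(IntWritable, Text) input would need a different InputFormat from the default one.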
Note: The data types emitted by the reducer are identified by setOutputKeyClass() and
setOutputValueClass(). By default, these are assumed to be the output types of the mapper as
well. If this is not the case, the setMapOutputKeyClass() and setMapOutputValueClass() methods
of the JobConf class override them.
Reference: Yahoo! Hadoop Tutorial, THE DRIVER METHOD