PrepAway - Latest Free Exam Questions & Answers

Which of the following interfaces is most likely to reduce the amount of intermediate data transferred across

You’ve written a MapReduce job that will process 500 million input records and generate 500
million key-value pairs. The data is not uniformly distributed. Your MapReduce job will create a
significant amount of intermediate data that it needs to transfer between mappers and reducers,
which is a potential bottleneck. A custom implementation of which of the following interfaces is
most likely to reduce the amount of intermediate data transferred across the network?


A. Writable

B. WritableComparable

C. InputFormat

D. OutputFormat

E. Combiner

F. Partitioner

Explanation:
The correct answer is E. Users can optionally specify a combiner, via
JobConf.setCombinerClass(Class), to perform local aggregation of the intermediate outputs on
the map side, which helps to cut down the amount of data transferred from the Mapper to the
Reducer.
Reference: Map/Reduce Tutorial
http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html (Mapper, 9th paragraph)
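
To make the effect of local aggregation concrete, here is a minimal, Hadoop-free sketch in plain Java. It is only an illustration under stated assumptions: the question does not say what the job computes, so the sketch assumes the reduce step is a per-key sum (an associative and commutative operation, which is what makes it safe to apply on the map side). The class name CombinerSketch and the method combine are hypothetical and are not part of the Hadoop API; in a real job the combiner would be a Reducer implementation registered with setCombinerClass.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: simulates what a combiner does to the
// output of a single mapper. No Hadoop classes are used.
final class CombinerSketch {

    // Sums the values for each key locally, so that at most one pair per
    // distinct key has to be sent from this mapper to the reducers.
    // Assumption: the job's reduce function is a sum.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, Integer> sums = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : mapOutput) {
            sums.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        // Five intermediate pairs emitted by one mapper; the key "a" is
        // over-represented, mimicking non-uniformly distributed data.
        List<Map.Entry<String, Integer>> mapOutput = List.of(
                Map.entry("a", 1), Map.entry("b", 1), Map.entry("a", 1),
                Map.entry("a", 1), Map.entry("b", 1));

        Map<String, Integer> combined = combine(mapOutput);

        System.out.println("pairs before combining: " + mapOutput.size());
        System.out.println("pairs after combining:  " + combined.size());
        System.out.println("combined output:        " + combined);
    }
}
```

The more often a key repeats within one mapper's output, the larger the saving, which is why a combiner tends to help most with skewed data such as the data set described in the question.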
