You’ve written a MapReduce job that will process 500 million input records and generate 500
million key-value pairs. The data is not uniformly distributed. Your MapReduce job will create a
significant amount of intermediate data that it needs to transfer between mappers and reducers,
which is a potential bottleneck. A custom implementation of which interface is most likely to reduce
the amount of intermediate data transferred across the network?

A. Partitioner
B. OutputFormat
C. WritableComparable
D. Writable
E. InputFormat
F. Combiner
Explanation:
Combiners increase the efficiency of a MapReduce program by aggregating the
intermediate output of each mapper locally, before it is sent over the network. This reduces
the amount of data that has to be transferred to the reducers. You can use your reducer code
as a combiner if the operation it performs is commutative and associative.
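The effect described above can be illustrated with a small simulation. This is only a sketch in plain Python, not Hadoop code: the function names (`map_words`, `combine`, `reduce_counts`) and the sample input are hypothetical, and the job is assumed to be a word count, where summing is commutative and associative. In a real Hadoop job the combiner would typically be a Reducer class registered with `job.setCombinerClass(...)`.

```python
from collections import Counter

def map_words(split):
    """Map phase: emit one (word, 1) pair per word in this mapper's input split."""
    return [(word, 1) for line in split for word in line.split()]

def combine(pairs):
    """Combiner: sum the counts per key on the mapper, before the shuffle."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())

def reduce_counts(pairs):
    """Reduce phase: sum the counts per key across all mappers."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Hypothetical input: two splits, each handled by one mapper.
splits = [
    ["a b a", "a c"],
    ["b b a"],
]

raw = [map_words(split) for split in splits]   # map output without a combiner
combined = [combine(pairs) for pairs in raw]   # map output after local combining

# Number of key-value pairs that would cross the network in each case.
shuffled_without = sum(len(pairs) for pairs in raw)
shuffled_with = sum(len(pairs) for pairs in combined)

# The final result is identical either way, because addition is
# commutative and associative.
result_without = reduce_counts([kv for pairs in raw for kv in pairs])
result_with = reduce_counts([kv for pairs in combined for kv in pairs])

print(shuffled_without, shuffled_with)  # 8 5
print(result_with)
```

With this toy input the mappers emit 8 pairs, but only 5 need to be shuffled once the combiner has run, and both paths produce the same counts. The saving grows with the number of repeated keys per mapper.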
F. Combiner
I choose F