PrepAway - Latest Free Exam Questions & Answers

Which interface is most likely to reduce the amount of intermediate data transferred across the network?

You’ve written a MapReduce job that will process 500 million input records and generate 500
million key-value pairs. The data is not uniformly distributed. Your MapReduce job will create a
significant amount of intermediate data that it needs to transfer between mappers and reducers,
which is a potential bottleneck. A custom implementation of which interface is most likely to reduce
the amount of intermediate data transferred across the network?


A. Partitioner
B. OutputFormat
C. WritableComparable
D. Writable
E. InputFormat
F. Combiner

Explanation:
Combiners are used to increase the efficiency of a MapReduce program. They
aggregate the intermediate output of each individual mapper locally, before it is sent to the
reducers. This reduces the amount of data that needs to be transferred across the network. You can
use your reducer code as a combiner if the operation performed is commutative and associative.
Reference: 24 Interview Questions & Answers for Hadoop MapReduce developers, "What are
combiners? When should I use a combiner in my MapReduce job?"
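
The effect described above can be illustrated with a small simulation. This is only a sketch in plain Python, not Hadoop code: the function names (map_words, sum_reduce, run_mapper, run_job) and the word-count data are hypothetical, chosen for illustration. It counts how many key-value pairs would cross the network with and without local combining, and reuses the reducer as the combiner because summation is commutative and associative.

```python
from collections import defaultdict

def map_words(record):
    """Mapper: emit (word, 1) for every word in one input record."""
    return [(word, 1) for word in record.split()]

def sum_reduce(key, values):
    """Reducer: sum the counts for one key.

    Summation is commutative and associative, so this same function
    can also be used as the combiner.
    """
    return key, sum(values)

def group_by_key(pairs):
    """Group values by key, as the shuffle phase would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def run_mapper(split, combiner=None):
    """Map one input split; optionally combine its output locally."""
    pairs = [pair for record in split for pair in map_words(record)]
    if combiner is None:
        return pairs
    return [combiner(k, vs) for k, vs in group_by_key(pairs).items()]

def run_job(splits, combiner=None):
    """Return the final counts and the number of pairs sent to the reducers."""
    shuffled = [pair for split in splits for pair in run_mapper(split, combiner)]
    result = dict(sum_reduce(k, vs) for k, vs in group_by_key(shuffled).items())
    return result, len(shuffled)

# Two hypothetical input splits, one per mapper.
splits = [
    ["the cat the dog", "the cat"],
    ["the bird", "the the dog"],
]

plain, plain_shuffled = run_job(splits)
combined, combined_shuffled = run_job(splits, combiner=sum_reduce)

print(plain == combined)                  # same final result either way
print(plain_shuffled, combined_shuffled)  # 11 pairs without combining, 6 with
```

The saving grows with the number of repeated keys per mapper. An operation such as a plain average would not be safe to reuse this way, because averaging partial averages is not associative.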

3 Comments on “Which interface is most likely to reduce the amount of intermediate data transferred across the network?”

    1. Ravindra Kumar says:

      No, there is no Combiner interface. A Combiner should simply be a Reducer, and thus implement the Reducer interface.




