PrepAway - Latest Free Exam Questions & Answers


What is the disadvantage of using multiple reducers with the default HashPartitioner and
distributing your workload across your cluster?


A.
You will not be able to compress the intermediate data.

B.
You will no longer be able to take advantage of a Combiner.

C.
By using multiple reducers with the default HashPartitioner, output files may not be in globally
sorted order.

D.
There are no concerns with this approach. It is always advisable to use multiple reducers.

Explanation:
The correct answer is C.

Multiple reducers and total ordering:
If your sort job runs with multiple reducers (either because mapreduce.job.reduces in mapred-site.xml has been set to a number larger than 1, or because you've used the -r option to specify the number of reducers on the command line), then by default Hadoop will use the HashPartitioner to distribute records across the reducers. Use of the HashPartitioner means that you can't simply concatenate your output files to create a single globally sorted output file. To do that you'd need total ordering.
Reference: Sorting text files with MapReduce
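To make the point concrete, here is a minimal Python sketch (not Hadoop code) that mimics the default HashPartitioner's partitioning logic, `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`, using a reimplementation of Java's `String.hashCode()`. The key set and reducer count are illustrative assumptions.

```python
def java_hash(s: str) -> int:
    """Reimplementation of Java's String.hashCode() with 32-bit overflow."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h

def partition(key: str, num_reducers: int) -> int:
    # Mirrors HashPartitioner: mask off the sign bit, then take the modulo.
    return (java_hash(key) & 0x7FFFFFFF) % num_reducers

keys = ["apple", "banana", "cherry", "date", "elderberry", "fig"]
num_reducers = 3

# The shuffle scatters keys across reducers by hash, not by key range.
buckets = [[] for _ in range(num_reducers)]
for k in keys:
    buckets[partition(k, num_reducers)].append(k)

# Each reducer's own output (one "part" file) is sorted...
for b in buckets:
    b.sort()

# ...but concatenating the part files generally does not yield one
# globally sorted file, because adjacent keys usually hash to
# different reducers.
concatenated = [k for b in buckets for k in b]
print(buckets)
print("globally sorted?", concatenated == sorted(concatenated))
```

In real Hadoop jobs the standard fix is to replace the HashPartitioner with a range-based partitioner such as TotalOrderPartitioner, so that reducer 0 receives the lowest key range, reducer 1 the next, and so on; the part files can then be concatenated into one sorted file.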
