how many key-value pairs will there be in each file?

seenagape, November 5, 2016

If you run the word count MapReduce program with m mappers and r reducers, how many output
files will you get at the end of the job? And how many key-value pairs will there be in each file?
Assume k is the number of unique words in the input files.


A.
There will be r files, each with exactly k/r key-value pairs.

B.
There will be r files, each with approximately k/m key-value pairs.

C.
There will be r files, each with approximately k/r key-value pairs.

D.
There will be m files, each with exactly k/m key-value pairs.

E.
There will be m files, each with approximately k/m key-value pairs.

Explanation:
Note:
* A MapReduce job with m mappers and r reducers involves up to m*r distinct copy operations,
since each mapper may have intermediate output going to every reducer.
* In the canonical example of word counting, a key-value pair is emitted for every word found. For
example, if the input contained 1,000 words, then 1,000 key-value pairs would be emitted from the
mappers to the reducer(s).
* Each reducer writes its own output file, so the job produces r files, and the k unique words are
spread across those files by the partitioner.
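
The notes above can be sketched as a small in-memory simulation in Python. This is not Hadoop code: the names partition and word_count are hypothetical, and zlib.crc32 is only an assumed stand-in for whatever hash the real partitioner uses. It shows that the number of emitted pairs equals the number of words, that at most m*r mapper-to-reducer transfers occur, and that the output is split into r parts regardless of m.

```python
from collections import Counter, defaultdict
import zlib


def partition(word, r):
    # Assumed stand-in for a hash partitioner: deterministic hash modulo r.
    return zlib.crc32(word.encode("utf-8")) % r


def word_count(splits, r):
    """Simulate word count; `splits` holds one list of words per mapper.

    Returns (pairs emitted by mappers, mapper-to-reducer transfers used,
    list of r Counters standing in for the r output files).
    """
    emitted = 0
    # shuffle[(mapper_id, reducer_id)] holds the pairs sent along that route
    shuffle = defaultdict(list)
    for mapper_id, words in enumerate(splits):
        for w in words:
            shuffle[(mapper_id, partition(w, r))].append((w, 1))
            emitted += 1

    outputs = [Counter() for _ in range(r)]
    for (_, reducer_id), pairs in shuffle.items():
        for w, one in pairs:
            outputs[reducer_id][w] += one
    return emitted, len(shuffle), outputs


# Three mappers (m = 3), two reducers (r = 2), four unique words (k = 4).
splits = [["a", "b", "a"], ["b", "c"], ["d", "a"]]
emitted, transfers, outputs = word_count(splits, r=2)
# emitted is 7 (one pair per word), transfers is at most 3 * 2,
# and len(outputs) is 2 with 4 distinct keys across both files.
```

Feeding the same words through a single mapper gives the same two outputs, which is why m does not appear in the answer.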

6 Comments on “how many key-value pairs will there be in each file?”


Vinod says:

October 21, 2014 at 12:27 am

A


yogeswaran says:

January 18, 2015 at 5:07 am

C


Jack says:

March 19, 2015 at 7:21 pm

answer is C, check

http://www.quora.com/If-you-run-the-word-count-MapReduce-program-with-m-mappers-and-r-reducers-how-many-output-files-will-you-get-at-the-end-of-the-job-And-how-many-key-value-pairs-will-there-be-in-each-file-Assume-k-is-the-number-of-unique-words-in-the-input-files


roja says:

April 9, 2015 at 5:14 pm

C. No guarantee that the no. of keys is always divisible by r, so the last file will have a bit less than the others.


Ramesh Hiremath says:

July 6, 2015 at 1:17 pm

C.
There will be r files, each with approximately k/r key-value pairs.

Explanation: The word count job emits each unique word once, together with the number of occurrences of that word. There will therefore be k key-value pairs in the output in total. As the job is executing with r reduce tasks, there will be r output files, one for each reducer.
The word keys are distributed more or less evenly among the reducers, so each output file will contain roughly k/r words. Note that the number of map tasks is irrelevant, as the intermediate output from all map tasks is combined as part of the shuffle phase.
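
To see why the answer says "approximately" rather than "exactly", here is a rough Python sketch of how keys might land in each file. Hadoop's default HashPartitioner computes (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks; the hash below imitates Java's String.hashCode as an illustrative assumption (the Text keys used by the real word count have their own hashCode), and the helper names are hypothetical.

```python
from collections import Counter


def java_string_hash(s):
    # Imitates Java's String.hashCode (h = 31*h + char), kept to 32 bits.
    # Assumption: characters are in the Basic Multilingual Plane.
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def hash_partition(key, r):
    # Same shape as HashPartitioner: clear the sign bit, then modulo r.
    return (java_string_hash(key) & 0x7FFFFFFF) % r


def pairs_per_file(unique_words, r):
    # Each unique word yields exactly one output pair, in exactly one file.
    sizes = Counter(hash_partition(w, r) for w in set(unique_words))
    return [sizes[i] for i in range(r)]


# k = 1000 hypothetical unique words spread over r = 4 reducers.
sizes = pairs_per_file([f"word{i}" for i in range(1000)], 4)
# sum(sizes) is always k; the individual entries depend on the hash
# and are only expected to be near k / r, not equal to it.
```

The sizes always add up to k, but how close each one is to k/r depends on the hash function and the keys, so an exact k/r split is not guaranteed even when r divides k.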


networkmanagers says:

November 6, 2016 at 12:55 am

I agree with the answer. A

