In a MapReduce job, the reducer receives all values associated with the same key. Which statement
best describes the ordering of these values?
Custom programmer-defined counters in MapReduce are:
What is the difference between a failed task attempt and a killed task attempt?
Which project gives you a distributed, scalable data store that allows random, real-time
read/write access to hundreds of terabytes of data?
You want to count the number of occurrences of each unique word in the supplied input data.
You’ve decided to implement this by having your mapper tokenize each word and emit a literal
value 1, and then have your reducer increment a counter for each literal 1 it receives. After
successfully implementing this, it occurs to you that you could optimize it by specifying a
combiner. Will you be able to reuse your existing reducer as your combiner in this case, and why
or why not?
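The setup in this question can be simulated outside Hadoop. The sketch below is plain Python, not the Hadoop API, and every name in it (map_words, sum_values, count_values, run_job) is hypothetical. It applies an optional combiner to each input split before the shuffle and compares two readings of "increment a counter": a reducer that sums the values it receives, and one that only counts how many values arrive.

```python
from collections import defaultdict

def map_words(line):
    # Mapper: tokenize the line and emit (word, 1) for every token.
    return [(word, 1) for word in line.split()]

def sum_values(key, values):
    # Reducer that adds up the values it receives.
    return key, sum(values)

def count_values(key, values):
    # Reducer that counts how many values arrive, ignoring their size.
    return key, sum(1 for _ in values)

def group(pairs):
    # Group values by key, as the shuffle phase would.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def run_job(splits, reducer, combiner=None):
    # Assumption: the combiner runs exactly once per split. Hadoop itself
    # may run a combiner zero, one, or several times.
    shuffled = []
    for split in splits:
        pairs = [pair for line in split for pair in map_words(line)]
        if combiner is not None:
            pairs = [combiner(k, vs) for k, vs in group(pairs).items()]
        shuffled.extend(pairs)
    return dict(reducer(k, vs) for k, vs in group(shuffled).items())

splits = [["a b a"], ["a c"]]
plain = run_job(splits, sum_values)
combined = run_job(splits, sum_values, combiner=sum_values)
miscounted = run_job(splits, count_values, combiner=count_values)
```

Under these assumptions, the summing reducer gives the same totals with or without the combiner, while the counting reducer undercounts "a" once partial results have been combined. Which reading the question intends is left to the reader.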
MapReduce is well-suited for all of the following applications EXCEPT? (Choose one):
To process input key-value pairs, your mapper needs to load a 512 MB data file into memory. What
is the best way to accomplish this?
Can you use MapReduce to perform a relational join on two large tables sharing a key? Assume
that the two tables are formatted as comma-separated files in HDFS.
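One common way to approach a join like this is a reduce-side join: the mapper tags each row with its source table and emits it under the join key, and the reducer pairs up rows from the two tables that share a key. The sketch below simulates that idea in plain Python rather than the Hadoop API. The function names, the sample rows, and the assumptions that the key is the first CSV field and that fields contain no embedded commas are all hypothetical.

```python
from collections import defaultdict

def map_record(table, line):
    # Mapper: the join key is assumed to be the first CSV field.
    # Each row is tagged with its source table ("L" or "R").
    key, rest = line.split(",", 1)
    return key, (table, rest)

def reduce_join(key, tagged_rows):
    # Reducer: pair every left row with every right row for this key.
    left = [row for tag, row in tagged_rows if tag == "L"]
    right = [row for tag, row in tagged_rows if tag == "R"]
    return [(key, l, r) for l in left for r in right]

def join(left_lines, right_lines):
    # Simulated shuffle: group the tagged rows by join key.
    grouped = defaultdict(list)
    for table, lines in (("L", left_lines), ("R", right_lines)):
        for line in lines:
            key, value = map_record(table, line)
            grouped[key].append(value)
    joined = []
    for key in sorted(grouped):
        joined.extend(reduce_join(key, grouped[key]))
    return joined

users = ["1,alice", "2,bob"]
orders = ["1,book", "1,pen", "3,lamp"]
joined = join(users, orders)
```

This produces an inner join: keys that appear in only one table (2 and 3 here) yield no output rows.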
A combiner reduces:
You have a directory named jobdata in HDFS that contains four files: _first.txt, second.txt, .third.txt,
and #data.txt. How many files will be processed by the FileInputFormat.setInputPaths() command
when it is given a Path object representing this directory?
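As background for this question: by default, Hadoop's FileInputFormat skips input files whose names begin with an underscore or a dot. The sketch below mimics that default filter in plain Python. The helper name is_visible is hypothetical, and a job that installs its own PathFilter could behave differently.

```python
def is_visible(name):
    # Mimics the default hidden-file rule: names starting with "_" or "."
    # are treated as hidden and excluded from the job's input.
    return not name.startswith(("_", "."))

names = ["_first.txt", "second.txt", ".third.txt", "#data.txt"]
processed = [name for name in names if is_visible(name)]
```

Under this rule, a leading "#" is not special, so #data.txt is treated like any ordinary file.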