In a MapReduce job, the reducer receives all values associated with the same key. Which
statement best describes the ordering of these values?
Which project gives you a distributed, scalable data store that allows random, real-time
read/write access to hundreds of terabytes of data?
You want to count the number of occurrences of each unique word in the supplied input data.
You have decided to implement this by having your mapper tokenize each word and emit a
literal value 1, and then having your reducer increment a counter for each literal 1 it receives.
After successfully implementing this, it occurs to you that you could optimize the job by
specifying a combiner. Will you be able to reuse your existing Reducer as your combiner in this
case, and why or why not?
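For intuition, here is a minimal plain-Java sketch (not the actual Hadoop Reducer API; the class and method names are illustrative) of why the word-count reduce function can double as a combiner: summation is associative and commutative, so summing partial per-mapper totals and then summing those totals gives the same result as summing everything at once.

```java
import java.util.Arrays;
import java.util.List;

public class CombinerSketch {
    // Stand-in for the reduce() body: sum all the literal 1s seen for a key.
    static int reduce(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        // Reducer alone, over all six 1s emitted for one word.
        int direct = reduce(Arrays.asList(1, 1, 1, 1, 1, 1));

        // Combiner first on each mapper's output, then the reducer sums
        // the partial counts.
        int partialA = reduce(Arrays.asList(1, 1));       // mapper A's output
        int partialB = reduce(Arrays.asList(1, 1, 1, 1)); // mapper B's output
        int viaCombiner = reduce(Arrays.asList(partialA, partialB));

        System.out.println(direct == viaCombiner); // prints "true"
    }
}
```

Note the same reasoning fails for non-associative reduce functions (for example, computing an average of the values), which is why a reducer cannot always be reused as a combiner.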
A combiner reduces:
You have a directory named jobdata in HDFS that contains four files: _first.txt, second.txt,
.third.txt, and #data.txt. How many files will be processed by the FileInputFormat.setInputPaths()
command when it is given a Path object representing this directory?
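As background for this question: FileInputFormat's default hidden-file filter skips any file whose name begins with an underscore or a dot. The sketch below replicates that rule in plain Java (it is not the Hadoop class itself) and applies it to the four filenames above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class HiddenFileSketch {
    // Mirrors FileInputFormat's default filter: names starting with
    // '_' or '.' are treated as hidden and skipped.
    static boolean isProcessed(String name) {
        return !name.startsWith("_") && !name.startsWith(".");
    }

    public static void main(String[] args) {
        List<String> dir = Arrays.asList("_first.txt", "second.txt", ".third.txt", "#data.txt");
        List<String> processed = dir.stream()
                .filter(HiddenFileSketch::isProcessed)
                .collect(Collectors.toList());
        System.out.println(processed); // [second.txt, #data.txt]
    }
}
```

A name like #data.txt matches neither prefix, so it is processed like any ordinary file.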
Identify the tool best suited to import a portion of a relational database every day as files into
HDFS and to generate Java classes to interact with that imported data.
You use the hadoop fs -put command to write a 300 MB file using an HDFS block size of 64 MB.
Just after this command has finished writing 200 MB of this file, what would another user see
when trying to access this file?
In a MapReduce job with 500 map tasks, how many map task attempts will there be?
You write a MapReduce job to process 100 files in HDFS. Your MapReduce algorithm uses
TextInputFormat: the mapper applies a regular expression over the input values and emits
key-value pairs with the key consisting of the matching text and the value containing the
filename and byte offset. Determine the difference between setting the number of reducers
to one and setting the number of reducers to zero.
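The behavioral contrast behind this question can be illustrated in plain Java (this is a simulation, not the Hadoop API): with one reducer, all map output is shuffled and sorted by key into a single output file; with zero reducers, the shuffle and sort phases are skipped and each mapper's output becomes its own file, in emission order.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ReducerCountSketch {
    // One reducer: merge every mapper's output and sort it by key into
    // one logical output file.
    static List<String> oneReducer(List<List<String>> perMapperOutput) {
        return perMapperOutput.stream()
                .flatMap(List::stream)
                .sorted()
                .collect(Collectors.toList());
    }

    // Zero reducers: no shuffle, no sort; each mapper's output is written
    // out as-is as a separate file.
    static List<List<String>> zeroReducers(List<List<String>> perMapperOutput) {
        return perMapperOutput;
    }

    public static void main(String[] args) {
        List<List<String>> mapOut = Arrays.asList(
                Arrays.asList("foo\tfile1:10", "bar\tfile1:42"), // mapper 1
                Arrays.asList("baz\tfile2:7"));                  // mapper 2
        System.out.println(oneReducer(mapOut));   // one sorted list
        System.out.println(zeroReducers(mapOut)); // unchanged per-mapper lists
    }
}
```

In real Hadoop the switch between the two scenarios is a single driver setting, job.setNumReduceTasks(1) versus job.setNumReduceTasks(0).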
Table metadata in Hive is: