How will you obtain these user records?
You have user profile records in your OLTP database that you want to join with web logs you
have already ingested into the Hadoop file system. How will you obtain these user records?
How many keys will be passed to the Reducer’s reduce method?
You have the following key-value pairs as output from your Map task:
(the, 1)
(fox, 1)
(faster, 1)
(than, 1)
(the, 1)
(dog, 1)
How many keys will be passed to the Reducer’s reduce method?
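The grouping behind this question can be sketched in plain Java. This is a minimal illustration, not Hadoop code: the class name `ShuffleSketch` and the helper `groupByKey` are hypothetical, and it assumes the default behavior in which the framework groups map output by key and calls reduce once per distinct key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ShuffleSketch {
    // Hypothetical helper: collects all values that share a key, mimicking how
    // map output is grouped (and sorted by key) before reduce is invoked.
    static Map<String, List<Integer>> groupByKey(String[] keys, int[] values) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (int i = 0; i < keys.length; i++) {
            grouped.computeIfAbsent(keys[i], k -> new ArrayList<>()).add(values[i]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // The six pairs listed in the question.
        String[] keys = {"the", "fox", "faster", "than", "the", "dog"};
        int[] values = {1, 1, 1, 1, 1, 1};

        Map<String, List<Integer>> grouped = groupByKey(keys, values);
        // One line per distinct key, with every value emitted for that key.
        grouped.forEach((k, v) -> System.out.println(k + " -> " + v));
        System.out.println("distinct keys: " + grouped.size());
    }
}
```

Under that assumption, the two (the, 1) pairs collapse into a single key with two values, so the six pairs yield five distinct keys.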
For each input key-value pair, mappers can emit:
Which is the best way to make this library available to your MapReduce job at runtime?
You need to perform statistical analysis in your MapReduce job and would like to call methods in
the Apache Commons Math library, which is distributed as a 1.3-megabyte Java archive (JAR) file.
Which is the best way to make this library available to your MapReduce job at runtime?
Which InputFormat should you use to complete the line: conf.setInputFormat(____.class);
Given a directory of files with the following structure: line number, tab character, string:
Example:
1	abialkjfjkaoasdfjksdlkjhqweroij
2	kadfjhuwqounahagtnbvaswslmnbfgy
3	kjfteiomndscxeqalkzhtopedkfsikj
You want to send each line as one record to your Mapper. Which InputFormat should you use to
complete the line: conf.setInputFormat(____.class);
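As background for this question, the sketch below shows in plain Java how a line of the form "line number, tab character, string" separates into a key and a value at the first tab. Hadoop's KeyValueTextInputFormat splits lines this way by default, whereas TextInputFormat passes the byte offset as the key and the whole line as the value. The class `TabSplitSketch` and the method `splitAtFirstTab` are hypothetical names used only for illustration, and no Hadoop classes are involved.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

public class TabSplitSketch {
    // Hypothetical helper: everything before the first tab becomes the key,
    // everything after it becomes the value. A line without a tab is treated
    // as a key with an empty value.
    static Map.Entry<String, String> splitAtFirstTab(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, tab), line.substring(tab + 1));
    }

    public static void main(String[] args) {
        // First example line from the question, with an explicit tab separator.
        Map.Entry<String, String> record = splitAtFirstTab("1\tabialkjfjkaoasdfjksdlkjhqweroij");
        System.out.println("key=" + record.getKey() + " value=" + record.getValue());
    }
}
```

Either way, each input line reaches the Mapper as exactly one record; the formats differ only in what they supply as the key.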
How do you configure a MapReduce job so that a single map task processes each input file?
In a MapReduce job, you want each of your input files processed by a single map task. How do you
configure a MapReduce job so that a single map task processes each input file regardless of how
many blocks the input file occupies?
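The relationship between blocks and map tasks can be illustrated with a deliberately simplified model. In Hadoop, a FileInputFormat subclass can mark its files as non-splittable by overriding isSplitable to return false; the plain-Java sketch below only models the effect on the task count. The class `SplitCountSketch` and the method `mapTasksForFile` are hypothetical, and the model assumes a non-empty file, a split size equal to the block size, and no minimum or maximum split size settings.

```java
public class SplitCountSketch {
    // Simplified model (an assumption, not Hadoop's actual split algorithm):
    // a splittable file gets one map task per block, while a non-splittable
    // file gets exactly one map task regardless of its size.
    static long mapTasksForFile(long fileSizeBytes, long blockSizeBytes, boolean splittable) {
        if (!splittable) {
            return 1;
        }
        // Ceiling division: a partially filled last block still needs a task.
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileSize = 300 * mb;   // illustrative file size
        long blockSize = 128 * mb;  // illustrative block size

        System.out.println("splittable: " + mapTasksForFile(fileSize, blockSize, true));
        System.out.println("not splittable: " + mapTasksForFile(fileSize, blockSize, false));
    }
}
```

In this model a 300 MB file stored in 128 MB blocks is handled by three map tasks when splittable and by one when not.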
Which InputFormat would you use to complete the line: setInputFormat(________.class);
Given a directory of files with the following structure: line number, tab character, string:
Example:
1	abialkjfjkaoasdfjksdlkjhqweroij
2	kadfjhuwqounahagtnbvaswslmnbfgy
3	kjfteiomndscxeqalkzhtopedkfslkj
You want to send each line as one record to your Mapper. Which InputFormat would you use to
complete the line: setInputFormat(________.class);
What is a SequenceFile?
All keys used for intermediate output from mappers must:
What data does a Reducer’s reduce method process?