Given a directory of files with the following structure: line number, tab character, string:
Example:
1abialkjfjkaoasdfjksdlkjhqweroij
2kadfjhuwqounahagtnbvaswslmnbfgy
3kjfteiomndscxeqalkzhtopedkfsikj
You want to send each line as one record to your Mapper. Which InputFormat should you use to
complete the line: conf.setInputFormat(____.class); ?
A. SequenceFileAsTextInputFormat
B. SequenceFileInputFormat
C. KeyValueFileInputFormat
D. BDBInputFormat
Explanation:
Note:
The output format for your first MR job should be SequenceFileOutputFormat. This stores the
key/value output from the reducer in a binary format that your second MR job can then read
back in using SequenceFileInputFormat.
Reference: How to parse CustomWritable from text in Hadoop
http://stackoverflow.com/questions/9721754/how-to-parse-customwritable-from-text-in-hadoop
(see answer 1 and then see the comment #1 for it)
Shouldn’t the answer be C?
Should be C in my opinion.
Option C.
C
I believe the question is wrong. It says the file structure is line number, tab character, string, but the examples given below it are wrong.
If I ignore the examples and the options provided, I would go for KeyValueTextInputFormat.
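As a rough illustration of why KeyValueTextInputFormat fits a "line number, tab character, string" layout: that format hands the Mapper one record per line, with the text before the first tab as the key and the rest of the line as the value. The plain-Java sketch below only mimics that split. It does not use the Hadoop API, the class and helper names are hypothetical, and the sample lines assume a tab after the line number.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical stand-in for the per-line split; not a Hadoop class.
class KeyValueSplitSketch {

    // Split a line at the first tab: key = text before it, value = text after it.
    // If the line has no tab, the whole line becomes the key and the value is empty.
    static Map.Entry<String, String> splitAtFirstTab(String line) {
        int pos = line.indexOf('\t');
        if (pos == -1) {
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, pos), line.substring(pos + 1));
    }

    public static void main(String[] args) {
        // Sample lines, assuming a tab separates the line number from the string.
        String[] lines = {
            "1\tabialkjfjkaoasdfjksdlkjhqweroij",
            "2\tkadfjhuwqounahagtnbvaswslmnbfgy",
            "3\tkjfteiomndscxeqalkzhtopedkfsikj"
        };
        for (String line : lines) {
            Map.Entry<String, String> record = splitAtFirstTab(line);
            System.out.println("key=" + record.getKey() + " value=" + record.getValue());
        }
    }
}
```

In an actual job using the old `org.apache.hadoop.mapred` API, the corresponding line would be `conf.setInputFormat(KeyValueTextInputFormat.class);`.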
B
It should be option C. Admin, kindly modify the answer.
I choose B