PrepAway - Latest Free Exam Questions & Answers

Which data serialization system gives you the flexibility to do this?

You need to analyze 60,000,000 images stored in JPEG format, each of which is approximately 25
KB. Because your Hadoop cluster isn’t optimized for storing and processing many small files, you
decide to take the following actions:
1. Group the individual images into a set of larger files
2. Use the set of larger files as input for a MapReduce job that processes them directly with
Python using Hadoop streaming
Which data serialization system gives you the flexibility to do this?


A. CSV

B. XML

C. HTML

D. Avro

E. Sequence Files

F. JSON

One Comment on “Which data serialization system gives you the flexibility to do this?”

  1. PasLe Choix says:

    E: Sequence files can be block-compressed and provide direct serialization and deserialization of several arbitrary data types. They can be generated as the output of other MR tasks and are an efficient intermediate representation for data passing from one MR job to another.
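
To make the "group the small images into larger files" step concrete, here is a minimal Python sketch of the underlying idea: many small binary payloads stored as key/value records (file name, raw bytes) inside one larger container. This is only an illustration under stated assumptions. It does not produce Hadoop's actual SequenceFile on-disk format, which also carries a header, sync markers, and optional compression. The function names `pack_records` and `unpack_records` and the sample file names are hypothetical.

```python
import io
import struct

# Each record: 4-byte key length, 4-byte value length (big-endian), key, value.
# NOTE: this layout is an assumption for illustration, not the SequenceFile format.
HEADER = ">II"
HEADER_SIZE = struct.calcsize(HEADER)


def pack_records(records, out):
    """Write (name, payload) pairs into one stream as length-prefixed records."""
    for name, payload in records:
        key = name.encode("utf-8")
        out.write(struct.pack(HEADER, len(key), len(payload)))
        out.write(key)
        out.write(payload)


def unpack_records(inp):
    """Yield (name, payload) pairs back from a packed stream, in order."""
    while True:
        header = inp.read(HEADER_SIZE)
        if not header:
            return  # clean end of stream
        if len(header) != HEADER_SIZE:
            raise ValueError("truncated record header")
        key_len, value_len = struct.unpack(HEADER, header)
        key = inp.read(key_len)
        value = inp.read(value_len)
        if len(key) != key_len or len(value) != value_len:
            raise ValueError("truncated record body")
        yield key.decode("utf-8"), value


# Stand-ins for JPEG bytes; real input would be the contents of the image files.
images = [
    ("img_0001.jpg", b"\xff\xd8\xff\xe0" + b"\x00" * 16),
    ("img_0002.jpg", b"\xff\xd8\xff\xe1" + b"\x01" * 8),
]

buffer = io.BytesIO()
pack_records(images, buffer)

buffer.seek(0)
restored = list(unpack_records(buffer))
```

The point of the sketch is that each image stays an opaque binary value addressed by a key, so a downstream job can iterate over records instead of opening millions of tiny files. Text formats such as CSV, XML, HTML, or JSON would need the binary data escaped or encoded first.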



