You need to analyze a large amount of data stored on Amazon S3 using Amazon Elastic MapReduce (EMR).
You are using the cc2.8xlarge instance type, whose CPUs are mostly idle during processing.
Which of the following would be the most cost-efficient way to reduce the runtime of the job?
Create more, smaller files on Amazon S3.
Add more cc2.8xlarge instances by introducing a task group.
Use smaller instances that have higher aggregate I/O performance.
Create fewer, larger files on Amazon S3.