PrepAway - Latest Free Exam Questions & Answers

Which of the following alternatives will lower costs without compromising average performance of the system or

Your department creates regular analytics reports from your company’s log files. All log data is collected in
Amazon S3 and processed by daily Amazon Elastic MapReduce (EMR) jobs that generate daily PDF reports and
aggregated tables in CSV format for an Amazon Redshift data warehouse.
Your CFO requests that you optimize the cost structure for this system.
Which of the following alternatives will lower costs without compromising average performance of the system
or data integrity for the raw data?


A.
Use reduced redundancy storage (RRS) for PDF and CSV data in Amazon S3. Add Spot Instances to Amazon
EMR jobs. Use Reserved Instances for Amazon Redshift.

B.
Use reduced redundancy storage (RRS) for all data in S3. Use a combination of Spot Instances and Reserved
Instances for Amazon EMR jobs. Use Reserved Instances for Amazon Redshift.

C.
Use reduced redundancy storage (RRS) for all data in Amazon S3. Add Spot Instances to Amazon EMR jobs.
Use Reserved Instances for Amazon Redshift.

D.
Use reduced redundancy storage (RRS) for PDF and CSV data in S3. Add Spot Instances to EMR jobs. Use Spot
Instances for Amazon Redshift.

8 Comments on “Which of the following alternatives will lower costs without compromising average performance of the system or

  1. Sandeep says:

    My answer would be A.

    B,C – RRS S3 for ‘ALL’ data may not be recommended. If log files are lost, then they cannot be recovered. Whereas PDF and CSV can be regenerated even if lost.
    D – Spot Instances are not possible for Redshift, I think.
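The reasoning above (raw logs cannot be recovered, while PDF and CSV outputs can be regenerated) can be sketched as a small rule for picking a storage class per object. This is only an illustration: the helper name `storage_class_for`, the suffix set, and the example keys are hypothetical, and it assumes derived outputs can be recognized by file extension. The returned strings match the S3 storage class names `STANDARD` and `REDUCED_REDUNDANCY`; no AWS call is made.

```python
from pathlib import PurePosixPath

# Assumption: derived outputs that the daily EMR job can regenerate
# are identifiable by extension. Raw logs are everything else.
REGENERABLE_SUFFIXES = {".pdf", ".csv"}


def storage_class_for(key: str) -> str:
    """Return an S3 storage class name for a (hypothetical) object key."""
    suffix = PurePosixPath(key).suffix.lower()
    if suffix in REGENERABLE_SUFFIXES:
        # Losing these is tolerable: they can be rebuilt from the raw logs.
        return "REDUCED_REDUNDANCY"
    # Raw log data must keep standard durability.
    return "STANDARD"


if __name__ == "__main__":
    for key in ("reports/2015-06-01.pdf", "tables/daily.csv", "logs/app.log.gz"):
        print(key, "->", storage_class_for(key))
```

A real bucket layout would more likely separate raw and derived data by prefix than by extension; the extension check is just the shortest way to show the idea.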




  2. DakkuDaddy says:

    Answer is A – Agree with Sandeep

    A. Use reduced redundancy storage (RRS) for PDF and CSV data in Amazon S3. Add Spot Instances to Amazon
    EMR jobs. Use Reserved Instances for Amazon Redshift.

    C – Not suitable, since the cluster is temporary.
    Core nodes should be reserved for the capacity that is required until your cluster completes (temporary).
    EMR can use Spot Instances; only the AWS GovCloud (US) region does not support Spot Instances.

    B, C – RRS for all data is not recommended in any case.

    D – Not possible, as Redshift recommends Reserved Instances.

    Reserved Instances (a.k.a. Reserved Nodes) are appropriate for steady-state production workloads, and offer significant discounts over On-Demand pricing.

    https://aws.amazon.com/redshift

    Last but not least, it's A because:

    Q: What are some EMR best practices?

    If you are running EMR in production you should specify an AMI version, Hive version, Pig version, etc. to make sure the version does not change unexpectedly (e.g. when EMR later adds support for a newer version). If your cluster is mission critical, only use Spot instances for task nodes because if the Spot price increases you may lose the instances. In development, use logging and enable debugging to spot and correct errors faster. If you are using GZIP, keep your file size to 1–2 GB because GZIP files cannot be split.

    https://aws.amazon.com/elasticmapreduce/faqs/
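The FAQ advice quoted above (Spot only for task nodes) can be sketched as a cluster layout. The snippet below only builds a list of dictionaries shaped like the `InstanceGroups` parameter accepted by boto3's EMR `run_job_flow`; it does not call AWS. The function name, group names, instance type, counts, and bid price are all hypothetical values chosen for illustration.

```python
def build_instance_groups(instance_type, core_count, task_count, task_bid_price):
    """Build an EMR instance-group layout: On-Demand master and core, Spot task.

    task_bid_price is a string (e.g. "0.10"), as the EMR API expects.
    """
    groups = [
        {
            "Name": "master",
            "InstanceRole": "MASTER",
            "Market": "ON_DEMAND",  # losing the master would end the cluster
            "InstanceType": instance_type,
            "InstanceCount": 1,
        },
        {
            "Name": "core",
            "InstanceRole": "CORE",
            "Market": "ON_DEMAND",  # core nodes hold HDFS data
            "InstanceType": instance_type,
            "InstanceCount": core_count,
        },
    ]
    if task_count > 0:
        groups.append(
            {
                "Name": "task-spot",
                "InstanceRole": "TASK",
                "Market": "SPOT",  # task nodes can be lost without losing data
                "BidPrice": task_bid_price,
                "InstanceType": instance_type,
                "InstanceCount": task_count,
            }
        )
    return groups


if __name__ == "__main__":
    # Hypothetical sizing for a daily report job.
    for group in build_instance_groups("m3.xlarge", 2, 4, "0.10"):
        print(group["InstanceRole"], group["Market"], group["InstanceCount"])
```

With this layout, a Spot price increase can slow the daily job down but should not terminate it, which is the trade-off the question's "average performance" wording allows.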




  3. vladam says:

    A is the right answer.

    B and C are wrong because you shouldn’t use RRS for ALL data.
    D is wrong because you can’t use Spot Instances for Redshift. It is the same as RDS – Redshift is always up and running, not something you launch and terminate at any time.



