PrepAway - Latest Free Exam Questions & Answers

Which naming scheme would give optimal performance on S3?

If an application is storing hourly log files from thousands of instances of a high-traffic
website, which naming scheme would give optimal performance on S3?

A. Sequential

B. HH-DD-MM-YYYY-log_instanceID

C. YYYY-MM-DD-HH-log_instanceID

D. instanceID_log-HH-DD-MM-YYYY

E. instanceID_log-YYYY-MM-DD-HH

27 Comments on “Which naming scheme would give optimal performance on S3?”

  1. venkat sai says:

    Yes, B is the right option. The main reason is the random prefix, which would give higher performance in this case.

    A – Doesn’t make sense
    C – YYYY (the year would be the same for every key, which makes good performance difficult to achieve)
    D & E – The instance ID would be the same for the first two characters (i-)




  2. BDA says:

    D: the random hostname prevents hammering a specific partition, and the HH-DD that follows the hostname is more random than the YYYY-MM in E.

    B will hammer a single partition each hour, because every instance writes under the same HH-DD prefix.

    A changes the I/O pattern, so it does not apply.

    C is just as bad as A.

    E is almost as good as D, but its YYYY will not be as random as D's HH.




  3. @dynadml says:

    I think the answer is C, because you will most likely search for logs by date and time across the various instances, but the word "log" should be at the end.




  4. certified says:

    Anyone who understands how S3 stores data knows that B is the option if you want performance. The key thing to remember here is that the more random or varied you can make the prefix, the more evenly your objects will be distributed across the stack.
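
The "more random prefix" argument made in the comments so far can be explored with a rough sketch. The Python below builds keys for options B to E and counts how many distinct leading characters they produce, first for many instances writing in the same hour, then for one instance writing across 24 hours. Everything here is an assumption for illustration: the instance IDs are made up, the 4-character prefix length is arbitrary, and counting distinct prefixes is only a crude proxy for how S3 might spread keys across partitions.

```python
import hashlib
from datetime import datetime, timedelta

# Key templates for options B-E from the question.
SCHEMES = {
    "B": "{t:%H-%d-%m-%Y}-log_{iid}",
    "C": "{t:%Y-%m-%d-%H}-log_{iid}",
    "D": "{iid}_log-{t:%H-%d-%m-%Y}",
    "E": "{iid}_log-{t:%Y-%m-%d-%H}",
}


def fake_instance_id(n):
    """Hypothetical instance ID: 'i-' plus 8 hex characters derived from n."""
    return "i-" + hashlib.md5(str(n).encode()).hexdigest()[:8]


def distinct_prefixes(keys, length=4):
    """Count the distinct leading `length` characters among the keys."""
    return len({key[:length] for key in keys})


def prefix_spread(scheme, instance_ids, hours, length=4):
    """Return (spread across instances in one hour, spread across hours for one instance)."""
    template = SCHEMES[scheme]
    same_hour = [template.format(t=hours[0], iid=iid) for iid in instance_ids]
    same_instance = [template.format(t=hour, iid=instance_ids[0]) for hour in hours]
    return distinct_prefixes(same_hour, length), distinct_prefixes(same_instance, length)


# Assumed workload: 1000 instances, one log file per hour for one day.
instance_ids = [fake_instance_id(n) for n in range(1000)]
start = datetime(2015, 1, 1, 0)
hours = [start + timedelta(hours=h) for h in range(24)]

for name in SCHEMES:
    per_hour, per_instance = prefix_spread(name, instance_ids, hours)
    print(f"{name}: {per_hour} prefixes within one hour, {per_instance} across 24 hours")
```

Under these assumptions, B varies its prefix from hour to hour but gives every instance the same prefix within a single hour, while D and E vary by instance within the hour because the characters after "i-" differ.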




  5. PowerCram says:

    NONE of these answers is correct. To partition data stored on S3, the key needs to contain one or more slashes (/), so the best approach in this scenario would be instanceID_log/YYYY/MM/DD/HH (the order of YYYY, MM, DD, and HH essentially doesn’t matter). Because the instance IDs are unique, they would act as an effective hash key, and the log file from each instance would be written to a different S3 partition.

    With the keys (i.e. file names) written as in the options above, they would all be written to the same partition in S3, no matter how the name components are reordered. Effectively, there is no performance difference among the listed options.
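
For reference, the key layout this comment proposes could be built as in the sketch below. The function name `build_log_key` and the sample instance ID are hypothetical, and the sketch only shows the string format; whether S3 treats slashes specially when partitioning is this commenter's claim, not something the code demonstrates.

```python
from datetime import datetime


def build_log_key(instance_id: str, ts: datetime) -> str:
    """Build a key of the form instanceID_log/YYYY/MM/DD/HH (hypothetical helper)."""
    return f"{instance_id}_log/{ts:%Y/%m/%d/%H}"


# Example with a made-up instance ID and timestamp.
print(build_log_key("i-0abc1234", datetime(2015, 3, 7, 9)))
```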




