PrepAway - Latest Free Exam Questions & Answers

What is the storage capacity of your Hadoop cluster (assuming no compression)?

Your cluster has 10 DataNodes, each with a single 1 TB hard drive. You utilize all your disk
capacity for HDFS, reserving none for MapReduce. You implement default replication settings.
What is the storage capacity of your Hadoop cluster (assuming no compression)?

PrepAway - Latest Free Exam Questions & Answers

A.
about 3 TB

B.
about 5 TB

C.
about 10 TB

D.
about 11 TB

Explanation:
In default configuration there are total 3 copies of a datablock on HDFS, 2 copies
are stored on datanodes on same rack and 3rd copy on a different rack.
Note:HDFS is designed to reliably store very large files across machines in a large cluster. It
stores each file as a sequence of blocks; all blocks in a file except the last block are the same
size. The blocks of a file are replicated for fault tolerance. The block size and replication factor are
configurable per file. An application can specify the number of replicas of a file. The replication

factor can be specified at file creation time and can be changed later. Files in HDFS are write-once
and have strictly one writer at any time. The NameNode makes all decisions regarding replication
of blocks. HDFS uses rack-aware replica placement policy.
Reference:24 Interview Questions & Answers for Hadoop MapReduce developers,How the HDFS
Blocks are replicated?

7 Comments on “What is the storage capacity of your Hadoop cluster (assuming no compression)?

  1. Vinod says:

    C

    For example assume the block size is 1TB.
    If we place 1TB file, then 3 nodes are full, if u place 2nd 1TB file then another 3 nodes are full. And placing 3rd 1TB file will fill another 3 nodes.
    After this even to copy a 1kb file will not replicate.




    0



    0
  2. soumya saha says:

    the ques is not clear .. it should be .. how much data we can put into that system. it is around 3 TB as again the data has to be replicated by three times.




    0



    0
  3. Amit says:

    The answer should be C. about 10 TB.
    The question is what is the storage capacity of the Hadoop CLUSTER when CLUSTER is having 10 nodes each with 1 TB.
    It is not asking about the TOTAL file storage capacity.




    0



    0

Leave a Reply

Your email address will not be published. Required fields are marked *