PrepAway - Latest Free Exam Questions & Answers

which best describes the file access rules in HDFS if the file has a single block that is stored on data nodes

A client application creates an HDFS file named foo.txt with a replication factor of 3. Identify which
best describes the file access rules in HDFS if the file has a single block that is stored on data
nodes A, B and C?

PrepAway - Latest Free Exam Questions & Answers

A.
The file will be marked as corrupted if data node B fails during the creation of the file.

B.
Each data node locks the local file to prohibit concurrent readers and writers of the file.

C.
Each data node stores a copy of the file in the local file system with the same name as the
HDFS file.

D.
The file can be accessed if at least one of the data nodes storing the file is available.

Explanation:
HDFS keeps three copies of a block on three different datanodes to protect against
true data corruption. HDFS also tries to distribute these three replicas on more than one rack to
protect against data availability issues. The fact that HDFS actively monitors any failed
datanode(s) and upon failure detection immediately schedules re-replication of blocks (if needed)
implies that three copies of data on three different nodes is sufficient to avoid corrupted files.
Note:
HDFS is designed to reliably store very large files across machines in a large cluster. It stores
each file as a sequence of blocks; all blocks in a file except the last block are the same size. The
blocks of a file are replicated for fault tolerance. The block size and replication factor are
configurable per file. An application can specify the number of replicas of a file. The replication
factor can be specified at file creation time and can be changed later. Files in HDFS are write-once
and have strictly one writer at any time. The NameNode makes all decisions regarding replication
of blocks. HDFS uses rack-aware replica placement policy. In default configuration there are total
3 copies of a datablock on HDFS, 2 copies are stored on datanodes on same rack and 3rd copy
on a different rack.
Reference: 24 Interview Questions & Answers for Hadoop MapReduce developers , How the
HDFS Blocks are replicated?

2 Comments on “which best describes the file access rules in HDFS if the file has a single block that is stored on data nodes


Leave a Reply

Your email address will not be published. Required fields are marked *