PrepAway - Latest Free Exam Questions & Answers

How must you format the underlying filesystem of your Hadoop cluster’s slave nodes running on Linux?


A.
They may be formatted in any Linux filesystem

B.
They must be formatted as HDFS

C.
They must be formatted as either ext3 or ext4

D.
They must not be formatted; HDFS will format the filesystem automatically

Explanation:
The Hadoop Distributed File System is platform independent and can function on top of any underlying file system and operating system. Linux offers a variety of file system choices, each with caveats that affect HDFS.
As a general best practice, if you are mounting disks solely for Hadoop data, mount them with the 'noatime' option. This disables access-time updates on every read and so speeds up file reads.
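As a minimal sketch, the /etc/fstab entry below shows what such a mount might look like; the device name (/dev/sdb1), mount point (/grid/0), and choice of ext4 are hypothetical:

    # /etc/fstab entry for a disk dedicated to Hadoop data (illustrative only)
    # noatime stops the kernel from updating access times on every read
    /dev/sdb1   /grid/0   ext4   defaults,noatime   0   0

The same option can be applied to an already-mounted disk with: mount -o remount,noatime /grid/0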
Three Linux file system options are popular choices:
Ext3
Ext4
XFS
Yahoo uses the ext3 file system for its Hadoop deployments, and ext3 is also the default filesystem for many popular Linux distributions. Since HDFS on ext3 has been publicly tested on Yahoo's clusters, it is a safe choice for the underlying file system.
ext4 is the successor to ext3 and performs better with large files. It also introduced delayed allocation of data, which decreases fragmentation and improves performance but adds some risk of data loss during unplanned server outages.
XFS offers better disk space utilization than ext3 and has much quicker disk formatting times, which makes it faster to bring a new data node into service.
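For illustration, the commands below sketch how a dedicated data disk might be formatted with each option; the device name /dev/sdc1 is hypothetical:

    # Format a dedicated Hadoop data disk (pick one filesystem)
    # -m 0 frees the root-reserved blocks, which a data-only disk does not need
    mkfs.ext3 -m 0 /dev/sdc1
    mkfs.ext4 -m 0 /dev/sdc1
    mkfs.xfs /dev/sdc1    # typically formats large disks much faster than ext3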
Reference:
Hortonworks, Linux File Systems for HDFS
