You have installed a cluster running HDFS and MapReduce version 2 (MRv2) on YARN.
You have no dfs.hosts entry(ies) in your hdfs-site.xml configuration file. You configure a
new worker node by setting fs.default.name in its configuration files to point to the
NameNode on your cluster, and you start the DataNode daemon on that worker node. What
do you have to do on the cluster to allow the worker node to join, and start storing HDFS blocks?
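For reference, a hedged sketch of the setting this question revolves around: when no dfs.hosts property is set (as here), any host may register as a DataNode; to restrict membership instead, an include file would be configured roughly like this (the file path is illustrative, not from the question):

```xml
<!-- hdfs-site.xml: only hosts listed in the include file below may
     register as DataNodes. Path is illustrative. -->
<property>
  <name>dfs.hosts</name>
  <value>/etc/hadoop/conf/dfs.hosts.include</value>
</property>
```

After editing such an include file, `hdfs dfsadmin -refreshNodes` makes the NameNode reread it.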
Assuming a cluster running HDFS, MapReduce version 2 (MRv2) on YARN with all settings
at their default, what do you need to do when adding a new slave node to a cluster?
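Context the question assumes: with all settings at their defaults there are no host include/exclude files, so a new slave node typically needs only the cluster addresses in its configuration before its daemons are started. A minimal sketch, with an illustrative hostname and port:

```xml
<!-- core-site.xml on the new slave node: point at the NameNode.
     Hostname/port below are assumptions for illustration. -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>
```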
You have a 20 node Hadoop cluster, with 18 slave nodes and 2 master nodes running
HDFS High Availability (HA). You want to minimize the chance of data loss in your cluster.
What should you do?
You decide to create a cluster which runs HDFS in High Availability mode with automatic
failover, using Quorum-based Storage. What is the purpose of ZooKeeper in such a configuration?
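Background the question leans on: with automatic failover, each NameNode host runs a ZKFailoverController, and a ZooKeeper ensemble provides failure detection and leader election between the two NameNodes. A minimal configuration sketch, with illustrative hostnames:

```xml
<!-- core-site.xml: ZooKeeper ensemble used by the ZKFailoverControllers
     for failure detection and active-NameNode election. -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>

<!-- hdfs-site.xml: enable automatic failover. -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
```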
During the execution of a MapReduce v2 (MRv2) job on YARN, where does the Mapper
place the intermediate data of each Map task?
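For context: a Map task writes its intermediate output to the local disk of the node where it runs (not to HDFS), under the NodeManager's local directories. A hedged sketch of the relevant setting, with illustrative paths:

```xml
<!-- yarn-site.xml: local directories where NodeManagers keep
     per-application working data, including map-side intermediate
     output. Paths are illustrative. -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data/1/yarn/local,/data/2/yarn/local</value>
</property>
```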
Which YARN daemon or service monitors a Container's per-application resource usage (e.g., memory, CPU)?
You are planning a Hadoop cluster and considering implementing 10 Gigabit Ethernet as
the network fabric. Which workloads benefit the most from a faster network fabric?
For each YARN job, the Hadoop framework generates task log files. Where are Hadoop's task log files stored?
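Related configuration, as a hedged sketch: container/task logs land on each NodeManager's local log directories, and with log aggregation enabled they are collected into HDFS after the application finishes. Paths below are illustrative.

```xml
<!-- yarn-site.xml: local directories for container/task logs,
     plus optional aggregation of finished-application logs to HDFS. -->
<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/data/1/yarn/container-logs</value>
</property>
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
```

With aggregation on, `yarn logs -applicationId <app id>` retrieves a job's collected logs.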
You use the hadoop fs -put command to add a file "sales.txt" to HDFS. This file is small
enough that it fits into a single block, which is replicated to three nodes in your cluster (with
a replication factor of 3). One of the nodes holding this file (a single block) fails. How will the
cluster handle the replication of this file in this situation?
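Background the question assumes: the NameNode notices the failed DataNode through missed heartbeats, marks the block under-replicated, and schedules a new replica on another live DataNode to restore the configured factor. A sketch of that setting:

```xml
<!-- hdfs-site.xml: target replication factor. The NameNode
     re-replicates any block with fewer live replicas than this. -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
```

Running `hdfs fsck /path/to/sales.txt -files -blocks -locations` would show the block's current replica placement (the path here is an assumption).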