What’s the relationship between tasks and task templates?
For a MapReduce job, on a cluster running MapReduce v1 (MRv1), what’s the relationship
between tasks and task templates?
What do you have to do on the cluster to allow the worker node to join, and start storing HDFS blocks?
You have installed a cluster running HDFS and MapReduce version 2 (MRv2) on YARN.
You have no dfs.hosts entry(ies) in your hdfs-site.xml configuration file. You configure a
new worker node by setting fs.default.name in its configuration files to point to the
NameNode on your cluster, and you start the DataNode daemon on that worker node. What
do you have to do on the cluster to allow the worker node to join, and start storing HDFS
blocks?
What command do you enter?
What do you need to do when adding a new slave node to a cluster?
Assuming a cluster running HDFS, MapReduce version 2 (MRv2) on YARN with all settings
at their default, what do you need to do when adding a new slave node to a cluster?
What should you do?
You have a 20 node Hadoop cluster, with 18 slave nodes and 2 master nodes running
HDFS High Availability (HA). You want to minimize the chance of data loss in your cluster.
What should you do?
What is the purpose of ZooKeeper in such a configuration?
You decide to create a cluster which runs HDFS in High Availability mode with automatic
failover, using Quorum-based Storage. What is the purpose of ZooKeeper in such a
configuration?
Where does the Mapper place the intermediate data of each Map task?
During the execution of a MapReduce v2 (MRv2) job on YARN, where does the Mapper
place the intermediate data of each Map task?
Which YARN daemon or service monitors a Container’s per-application resource usage (e.g., memory, CPU)?
Which YARN daemon or service monitors a Container’s per-application resource usage (e.g.,
memory, CPU)?
Which workloads benefit the most from a faster network fabric?
You are planning a Hadoop cluster and considering implementing 10 Gigabit Ethernet as
the network fabric. Which workloads benefit the most from a faster network fabric?
Where are Hadoop’s task log files stored?
For each YARN job, the Hadoop framework generates task log files. Where are Hadoop’s
task log files stored?