Which YARN daemon or service monitors a Container’s per-application resource usage (e.g., memory, CPU)?
Which YARN daemon or service monitors a Container’s per-application resource usage (e.g.,
memory, CPU)?
Which workloads benefit the most from a faster network fabric?
You are planning a Hadoop cluster and considering implementing 10 Gigabit Ethernet as the
network fabric. Which workloads benefit the most from a faster network fabric?
What should you do?
You want a node to swap Hadoop daemon data from RAM to disk only when absolutely
necessary. What should you do?
Where are Hadoop’s task log files stored?
For each YARN job, the Hadoop framework generates task log files. Where are Hadoop’s task log
files stored?
Which two daemons need to be installed on your cluster’s master nodes?
You are configuring your cluster to run HDFS and MapReduce v2 (MRv2) on YARN. Which two
daemons need to be installed on your cluster’s master nodes?
How will the cluster handle the replication of this file in this situation?
You are using the hadoop fs -put command to add a file “sales.txt” to HDFS. This file is small enough
that it fits into a single block, which is replicated to three nodes in your cluster (with a replication
factor of 3). One of the nodes holding this file (a single block) fails. How will the cluster handle the
replication of this file in this situation?
How would you tune your io.sort.mb value to achieve maximum memory to disk I/O ratio?
You observed that the number of spilled records from Map tasks far exceeds the number of map
output records. Your child heap size is 1GB and your io.sort.mb value is set to 1000MB. How
would you tune your io.sort.mb value to achieve maximum memory to disk I/O ratio?
Which daemons need to be installed on your cluster’s master nodes?
You are configuring your cluster to run HDFS and MapReduce v2 (MRv2) on YARN. Which
daemons need to be installed on your cluster’s master nodes?
Which best describes how you determine when the last checkpoint happened?
You are running a Hadoop cluster with a NameNode on host mynamenode, a secondary
NameNode on host mysecondarynamenode and several DataNodes.
Which best describes how you determine when the last checkpoint happened?
What happens when you issue that third command?
Assume you have a file named foo.txt in your local directory. You issue the following three
commands:
hadoop fs -mkdir input
hadoop fs -put foo.txt input/foo.txt
hadoop fs -put foo.txt input
What happens when you issue that third command?