What’s the relationship between JobTrackers and TaskTrackers?
Which of the following scenarios makes HDFS unavailable?
Which scenario will go undetected?
You are running a Hadoop cluster with all monitoring facilities properly configured. Which scenario
will go undetected?
What is the maximum amount of virtual memory allocated for each map task before YARN will kill its Container?
Your cluster’s mapred-site.xml includes the following parameters:

<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>8192</value>
</property>

And your cluster’s yarn-site.xml includes the following parameter:

<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>2.1</value>
</property>

What is the maximum amount of virtual memory allocated for each map task before YARN will kill its Container?
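As background for this question: YARN enforces a virtual-memory ceiling equal to a container’s physical allocation multiplied by yarn.nodemanager.vmem-pmem-ratio. A minimal sketch of that arithmetic follows; the helper function name is invented for illustration, not part of any Hadoop API.

```python
# Hypothetical helper mirroring YARN's virtual-memory check: the container's
# virtual-memory limit is its physical allocation (mapreduce.map.memory.mb)
# multiplied by yarn.nodemanager.vmem-pmem-ratio.
def vmem_limit_mb(pmem_mb: float, vmem_pmem_ratio: float) -> float:
    return pmem_mb * vmem_pmem_ratio

# With the values from the question: 4096 MB * 2.1
map_task_limit = vmem_limit_mb(4096, 2.1)
print(map_task_limit)  # 8601.6 MB, i.e. roughly 8.4 GB per map container
```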
Table schemas in Hive are:
What is the maximum number of NameNode daemons you should run on your cluster in order to avoid a “split-brain” scenario?
Assuming you’re not running HDFS Federation, what is the maximum number of
NameNode daemons you should run on your cluster in order to avoid a “split-brain” scenario
with your NameNode when running HDFS High Availability (HA) using Quorum-based
storage?
Where are Hadoop task log files stored?
For each YARN job, the Hadoop framework generates task log files. Where are Hadoop task
log files stored?
How will the Fair Scheduler handle these two jobs?
You have a cluster running with the Fair Scheduler enabled. There are currently no jobs
running on the cluster, and you submit Job A, so that only Job A is running on the cluster. A
while later, you submit Job B. Now Job A and Job B are running on the cluster at the same
time. How will the Fair Scheduler handle these two jobs?
What should you do?
Each node in your Hadoop cluster, running YARN, has 64GB memory and 24 cores. Your
yarn-site.xml has the following configuration:

<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>32768</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>12</value>
</property>

You want YARN to launch no more than 16 containers per node. What should you do?
What should you do?
You want each node to swap Hadoop daemon data from RAM to disk only when absolutely
necessary. What should you do?
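For context, swapping behavior on Linux worker nodes is controlled by the kernel’s vm.swappiness setting; low values tell the kernel to avoid swapping except under real memory pressure. A hedged sketch of such a setting (the file path and value are the conventional ones, not taken from the question):

```
# /etc/sysctl.conf on each worker node:
# swap Hadoop daemon data only when absolutely necessary
vm.swappiness = 1

# apply without a reboot:
#   sysctl -p
```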