How many Mappers will run?
On a cluster running MapReduce v2 (MRv2) on YARN, a MapReduce job is given a
directory of 10 plain text files as its input directory. Each file is made up of 3 HDFS blocks.
How many Mappers will run?
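As a rough sketch of the arithmetic behind this question: assuming the default TextInputFormat, a split size equal to the HDFS block size, and uncompressed (splittable) text files, each block becomes one input split and each split is handled by one Mapper. The function name `estimate_mappers` is hypothetical and used only for illustration.

```python
def estimate_mappers(num_files: int, blocks_per_file: int) -> int:
    """Estimate the Mapper count, assuming one input split per HDFS block."""
    # Assumption: default split size == block size and the files are splittable,
    # so each block yields exactly one split and each split runs one Mapper.
    return num_files * blocks_per_file


# The scenario in the question: 10 files of 3 blocks each.
print(estimate_mappers(10, 3))
```

Under these assumptions the job in the question would run 30 Mappers. A custom split size or a combining input format would change the count.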
What should you do?
You’re upgrading a Hadoop cluster from HDFS and MapReduce version 1 (MRv1) to one
running HDFS and MapReduce version 2 (MRv2) on YARN. You want to set and enforce a
block size of 128MB for all new files written to the cluster after the upgrade.
What should you do?
Which describes the file read process when a client application connects into the cluster and requests a 50MB
Your cluster has the following characteristics:
– A rack-aware topology is configured and on
– Replication is set to 3
– Cluster block size is set to 64MB
Which describes the file read process when a client application connects into the
cluster and requests a 50MB file?
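A minimal sketch of the block arithmetic relevant to this scenario, assuming a file occupies the ceiling of its size divided by the block size. The helper `blocks_for_file` is a hypothetical name, not part of any Hadoop API.

```python
import math


def blocks_for_file(file_size_mb: float, block_size_mb: float) -> int:
    """Return how many HDFS blocks a file of the given size would occupy."""
    # A file spans ceil(size / block size) blocks; the last block may be partial.
    return math.ceil(file_size_mb / block_size_mb)


# The scenario in the question: a 50MB file on a cluster with 64MB blocks.
print(blocks_for_file(50, 64))
```

With a 64MB block size, the 50MB file fits in a single block, so the read involves the locations of only one block.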
Can you configure a worker node to run a NodeManager daemon but not a DataNode daemon and still have a functio
Your Hadoop cluster is configured with HDFS and MapReduce version 2 (MRv2) on YARN.
Can you configure a worker node to run a NodeManager daemon but not a DataNode
daemon and still have a functional cluster?
What should you do?
You have a 20-node Hadoop cluster, with 18 slave nodes and 2 master nodes running
HDFS High Availability (HA). You want to minimize the chance of data loss in your cluster.
What should you do?
Which scenario will go undetected?
You are running a Hadoop cluster with all monitoring facilities properly configured. Which
scenario will go undetected?
What is the purpose of ZooKeeper in such a configuration?
You decide to create a cluster which runs HDFS in High Availability mode with automatic
failover, using Quorum Storage. What is the purpose of ZooKeeper in such a configuration?
Why should you run the HDFS balancer periodically?
Choose three reasons why you should run the HDFS balancer periodically.
What occurs when you execute the command: hdfs haadmin -failover nn01 nn02?
Your cluster implements HDFS High Availability (HA). Your two NameNodes are named
nn01 and nn02. What occurs when you execute the command: hdfs haadmin -failover nn01
nn02?
you need to do in order to run Impala on the cluster and submit jobs from the command line of the gateway mach
You have a Hadoop cluster running HDFS, and a gateway machine external to the cluster from
which clients submit jobs. What do you need to do in order to run Impala on the cluster and
submit jobs from the command line of the gateway machine?