Identify four characteristics of a 300MB file that has been written to HDFS with a block size of
128MB and all other Hadoop defaults unchanged.
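The arithmetic behind this question can be sketched in a few lines of Python. This is only an illustration: it assumes the HDFS default replication factor of 3 and that the final block occupies only the space it actually needs. The helper name `split_into_blocks` is hypothetical and not part of any Hadoop API.

```python
def split_into_blocks(file_size_mb, block_size_mb):
    """Return the sizes (in MB) of the blocks a file of the given size occupies."""
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    sizes = [block_size_mb] * full_blocks
    if remainder:
        # The last block holds only the leftover data, not a full block.
        sizes.append(remainder)
    return sizes


REPLICATION = 3  # assumption: HDFS default replication factor

blocks = split_into_blocks(300, 128)
total_replicas = len(blocks) * REPLICATION
raw_storage_mb = sum(blocks) * REPLICATION

print("block sizes (MB):", blocks)
print("block replicas stored:", total_replicas)
print("raw storage used (MB):", raw_storage_mb)
```

Under these assumptions the file splits into two full blocks and one partial block, and each block is stored three times across the cluster.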
Identify which two daemons typically run on each slave node in a Hadoop cluster running MapReduce
v1 (MRv1).
Which three processes does HDFS High Availability (HA) enable on your cluster?
You are planning a Hadoop cluster, and you expect to be receiving just under 1TB of data per
week which will be stored on the cluster, using Hadoop’s default replication. You decide that your
slave nodes will be configured with 4 x 1TB disks.
Calculate how many slave nodes you need to deploy at a minimum to store one year’s worth of
data.
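One way to work through the sizing is the short Python sketch below. It is a back-of-the-envelope estimate only: it assumes the default replication factor of 3, rounds "just under 1TB" up to 1TB per week, takes a year as 52 weeks, and ignores any disk space reserved for the operating system or intermediate MapReduce data. All constant names are illustrative.

```python
import math

WEEKLY_DATA_TB = 1    # "just under 1TB" per week, rounded up
WEEKS_PER_YEAR = 52
REPLICATION = 3       # assumption: HDFS default replication factor
DISKS_PER_NODE = 4
DISK_SIZE_TB = 1

# Raw storage needed once every block is replicated.
raw_needed_tb = WEEKLY_DATA_TB * WEEKS_PER_YEAR * REPLICATION

# Raw capacity contributed by a single slave node.
node_capacity_tb = DISKS_PER_NODE * DISK_SIZE_TB

# Round up, since a fraction of a node cannot be deployed.
min_nodes = math.ceil(raw_needed_tb / node_capacity_tb)

print("raw storage needed (TB):", raw_needed_tb)
print("minimum slave nodes:", min_nodes)
```

In practice a real deployment would add headroom on top of this minimum, since the estimate leaves no space for non-HDFS data.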
You have a Hadoop cluster with a NameNode on host mynamenode. What are two ways to
determine available HDFS space in your cluster?
You have a cluster running with the Fair Scheduler enabled. There are currently no jobs running on
the cluster. You submit job A, so that only job A is running on the cluster. A while later, you
submit job B. Now job A and job B are running on the cluster at the same time. How will the Fair
Scheduler handle these two jobs?
Your Hadoop cluster has 12 slave nodes, a block size set to 64MB, and a replication factor of
three.
Choose the statement that best describes how the Hadoop framework distributes block writes into
HDFS from a Reducer outputting a 150MB file.
Which scheduler would you deploy to ensure that your cluster allows short jobs to finish within a
reasonable time without starving long-running jobs?
You are developing a combiner that takes as input Text keys, IntWritable values, and emits Text
keys, IntWritable values. Which interface should your class implement?
When is the earliest point at which the reduce method of a given Reducer can be called?