You need to create a job that does frequency analysis on input data. You will do this by writing a
Mapper that uses TextInputFormat and splits each value (a line of text from an input file) into
individual characters. For each one of these characters, you will emit the character as a key and
an IntWritable as the value. Since this will produce proportionally more intermediate data than
input data, which resources could you expect to be likely bottlenecks?
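The Mapper described in this question can be sketched without any Hadoop dependency. The block below is a plain-Java stand-in, not the Hadoop API: the class name `CharFrequencyMapSketch`, the `map` helper, and the size estimate (UTF-8 bytes of the key plus 4 bytes for an int) are all illustrative assumptions, and real Hadoop serialization adds its own overhead. It only shows how one input line turns into one key-value pair per character.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Mapper in the question; no Hadoop classes are used.
public class CharFrequencyMapSketch {

    // Emits one (character, 1) pair per character of the line.
    // Assumes plain ASCII input; charAt() works on UTF-16 code units.
    public static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (int i = 0; i < line.length(); i++) {
            out.add(new SimpleEntry<>(String.valueOf(line.charAt(i)), 1));
        }
        return out;
    }

    // Rough size estimate (an assumption, not Hadoop's wire format):
    // UTF-8 bytes of each key plus 4 bytes for the int value.
    public static long intermediateBytes(List<Map.Entry<String, Integer>> pairs) {
        long total = 0;
        for (Map.Entry<String, Integer> p : pairs) {
            total += p.getKey().getBytes(StandardCharsets.UTF_8).length + Integer.BYTES;
        }
        return total;
    }

    public static void main(String[] args) {
        String line = "hello hadoop";
        List<Map.Entry<String, Integer>> pairs = map(line);
        System.out.println("input bytes: " + line.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("records emitted: " + pairs.size());
        System.out.println("approx. intermediate bytes: " + intermediateBytes(pairs));
    }
}
```

Under these assumptions, every input byte becomes a separate record that is several bytes long, which is the amplification the question refers to.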
You have one primary HMaster and one standby. Your primary HMaster fails and your client
application needs to make a metadata change. Which of the following is the effect on your client
application?
Which two updates occur when a client application opens a stream to begin a file write on a cluster
running MapReduce v1 (MRv1)?
You have an “Employees” table in HBase. The Row Keys are the employees’ IDs. You would like
to retrieve all employees who have an employee ID between ‘user_100’ and ‘user_110’. The shell
command you would use to complete this is:
The cells in a given row have versions that range from 1000 to 2000. You execute a delete
specifying the value 3000 for the version. What is the outcome?
You have an average key-value pair size of 100 bytes. Your primary access is random reads on
the table. Which of the following actions will speed up random reading performance on your
cluster?
For a MapReduce job, on a cluster running MapReduce v1 (MRv1), what’s the relationship
between tasks and task attempts?
Which MapReduce v2 (MR2/YARN) daemon is a per-machine slave responsible for launching
application containers and monitoring application resource usage?
How does the NameNode know DataNodes are available on a cluster running MapReduce v1
(MRv1)?
What happens if a Mapper on one node goes into an infinite loop while running a MapReduce job?