Which of the following is true?
You have a table where keys range from “A” to “Z”, and you want to scan from “D” to “H.” Which of
the following is true?
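As a rough illustration of the scan in this question, here is a minimal sketch that models a table as a sorted list of row keys. It assumes the usual HBase scan semantics, where the start row is inclusive and the stop row is exclusive. The helper `range_scan` is hypothetical and is not part of any HBase API.

```python
from bisect import bisect_left
import string

def range_scan(sorted_keys, start_row, stop_row):
    """Return keys in [start_row, stop_row): start inclusive, stop exclusive.

    Hypothetical helper that mimics a range scan over lexicographically
    sorted row keys; it does not call HBase.
    """
    lo = bisect_left(sorted_keys, start_row)
    hi = bisect_left(sorted_keys, stop_row)
    return sorted_keys[lo:hi]

# Row keys "A" through "Z", already in sorted order.
keys = list(string.ascii_uppercase)

print(range_scan(keys, "D", "H"))  # ['D', 'E', 'F', 'G']
print(range_scan(keys, "D", "I"))  # ['D', 'E', 'F', 'G', 'H']
```

Because the keys are sorted, only the slice between the two boundaries is touched. Under the exclusive-stop assumption, row "H" is returned only if the stop row sorts after it.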
Which best describes the file read process when a client application connects to the cluster and requests a 50MB file?
Your cluster is running MapReduce v1 (MRv1), with default replication set to 3 and a cluster block size of
64MB. Identify which best describes the file read process when a client application connects to
the cluster and requests a 50MB file?
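The block arithmetic behind this scenario can be sketched as follows. The function name `hdfs_block_count` is hypothetical, and the sketch only counts blocks; it does not model the read path itself.

```python
import math

def hdfs_block_count(file_size_mb, block_size_mb=64):
    """Number of HDFS blocks needed for a file of the given size.

    Hypothetical helper: a file is cut into block_size_mb pieces and
    the last piece may be smaller than a full block.
    """
    return math.ceil(file_size_mb / block_size_mb)

blocks = hdfs_block_count(50, block_size_mb=64)
replication = 3

print(blocks)                # 1
print(blocks * replication)  # 3 replicas stored across the cluster
```

A 50MB file fits inside a single 64MB block, so with replication set to 3 the cluster holds three copies of that one block, and a reader needs only one of them.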
How many Mappers will run?
On a cluster running MapReduce v1 (MRv1), a MapReduce job is given a directory of 10 plain text files
as its input directory. Each file is made up of 3 HDFS blocks. How many Mappers will run?
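The arithmetic for this scenario can be worked through with a short sketch. It assumes the default behaviour for uncompressed plain text input, where each HDFS block becomes one input split and each split is handled by one map task. A custom split size or a non-splittable format would change the result. The function name `estimate_mappers` is hypothetical.

```python
def estimate_mappers(num_files, blocks_per_file):
    """Estimate the number of map tasks for a job.

    Assumption: one input split per HDFS block and one mapper per split,
    as with default settings on splittable plain text files.
    """
    return num_files * blocks_per_file

print(estimate_mappers(num_files=10, blocks_per_file=3))  # 30
```

Splits never span files, so the per-file block counts simply add up: 10 files of 3 blocks each give 30 splits under these assumptions.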
Where should your application configure the maximum number of versions to be retrieved?
From within an HBase application, you want to retrieve two versions of a row, if they exist. Where
should your application configure the maximum number of versions to be retrieved?
How would you design the schema?
You have two tables in an existing RDBMS. One table contains order information (item, quantity,
price, etc.) and the other contains store information (address, phone, manager, etc). These two
tables are not often accessed simultaneously. You would like to move this data into HBase. How
would you design the schema?
Where will the row be written?
Your client application needs to write a row to a region that has recently split. Where will the row
be written?
Identify four pieces of cluster information that are stored on disk on the NameNode.
How must you format the underlying filesystem of your Hadoop cluster’s slave nodes running on Linux?
Which of the following sequences will it traverse to find the region serving the row range of interest?
Your client application connects to HBase for the first time to perform a write. Which of the
following sequences will it traverse to find the region serving the row range of interest?
Which of the following configuration values determines automated splitting?