PrepAway - Latest Free Exam Questions & Answers

What are two possible techniques that you can use?

You are designing a solution that will use Apache HBase on Microsoft Azure HDInsight.
You need to design the row keys for the database to ensure that client traffic is directed over all of the nodes in
the cluster.
What are two possible techniques that you can use? Each correct answer presents a complete solution.
NOTE: Each correct selection is worth one point.

A. padding
B. trimming
C. hashing
D. salting

Explanation:
There are two strategies that you can use to avoid hotspotting (answers C and D):
* Hashing keys
To spread write and insert activity across the cluster, you can randomize sequentially generated keys by
hashing the keys or inverting the byte order. Note that these strategies come with trade-offs. Hashing keys, for
example, makes table scans over key subranges inefficient, because the subrange is spread across the cluster.
* Salting keys
Instead of hashing the key, you can salt the key by prepending a few bytes of the hash of the key to the actual
key.
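As a rough illustration of the difference between the two strategies, here is a minimal Python sketch. It is not HBase client code: the function names, the choice of MD5, and the two-byte salt length are assumptions made only for this example.

```python
import hashlib


def hashed_key(logical_key: bytes) -> bytes:
    """Hashing: the row key is replaced by a digest of the logical key.

    MD5 is an assumption here; any well-distributed hash would do.
    """
    return hashlib.md5(logical_key).digest()


def salted_key(logical_key: bytes, salt_len: int = 2) -> bytes:
    """Salting: the first salt_len bytes of the hash are prepended to the key.

    The original key stays readable after the salt prefix.
    """
    salt = hashlib.md5(logical_key).digest()[:salt_len]
    return salt + logical_key


# Sequential timestamps would normally sort next to each other.
keys = [f"2015-06-01T00:00:{s:02d}".encode() for s in range(5)]
for k in keys:
    print(hashed_key(k).hex()[:8], salted_key(k))
```

In both cases the key is deterministic, so a client that knows the logical key can recompute the stored row key for a point lookup. Only the salted form still contains the logical key itself.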
Note: Salting Apache HBase tables and pre-splitting them is a proven, effective way to distribute the
workload uniformly across RegionServers and prevent hot spots during bulk writes. In this design, a row key
is made of a salt followed by the logical key. One way to generate the salt is to take the hash code of the
logical row key (date, etc.) modulo n, where n is the number of regions.
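The modulo-based salt described in the note can be sketched as follows. This is a plain Python sketch under stated assumptions, not an HBase API: `NUM_REGIONS`, the CRC32 hash, the two-digit salt format, and all function names are hypothetical choices for illustration.

```python
import zlib

NUM_REGIONS = 8  # assumed number of pre-split regions (n)


def salt_for(logical_key: str, num_regions: int = NUM_REGIONS) -> int:
    """Salt = hash code of the logical key modulo the number of regions."""
    return zlib.crc32(logical_key.encode("utf-8")) % num_regions


def salted_row_key(logical_key: str, num_regions: int = NUM_REGIONS) -> str:
    """Row key = zero-padded salt, a separator, then the logical key."""
    return f"{salt_for(logical_key, num_regions):02d}-{logical_key}"


def split_points(num_regions: int = NUM_REGIONS) -> list:
    """Pre-split boundaries so that each salt value gets its own region."""
    return [f"{i:02d}-" for i in range(1, num_regions)]


def scan_ranges(start: str, stop: str, num_regions: int = NUM_REGIONS) -> list:
    """A scan over a logical key range turns into one scan per salt bucket."""
    return [(f"{i:02d}-{start}", f"{i:02d}-{stop}") for i in range(num_regions)]


print(salted_row_key("2015-06-01"))
print(split_points())
print(scan_ranges("2015-06-01", "2015-06-02")[:2])
```

The last function shows the trade-off: because rows from one logical range are spread over every bucket, a range scan has to be issued once per salt value and the results merged by the client.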

https://blog.cloudera.com/blog/2015/06/how-to-scan-salted-apache-hbase-tables-with-region-specific-keyranges-in-mapreduce/
http://maprdocs.mapr.com/51/MapR-DB/designing_row_keys_for_mapr_db_binary_tables.html
