# Multiple-Choice Hashing

• Multiple-choice hashing gets its name from its use of several hash functions.
• At a high level, when there are several hash functions, each item is mapped to several candidate buckets, so the algorithm designer is free to choose which of them the item resides in.
• It turns out that this freedom allows algorithms to achieve allocations that are much more balanced than those obtained with a single hash function.
• We will present the main algorithmic ideas and the main mathematical tools used to prove bounds on the allocations these algorithms produce.
• We will see that the analysis is robust enough to withstand variations in the basic model, which in our view explains the effectiveness of these algorithms in practical applications.
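
The effect of having a choice can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the candidate buckets are drawn with a seeded pseudo-random generator instead of real hash functions, and the placement rule (put the item in the least loaded candidate) is one natural choice that the text above does not spell out. The name `max_load` and its parameters are hypothetical.

```python
import random


def max_load(num_items, num_buckets, num_choices, seed=0):
    """Place items one by one and return the size of the fullest bucket.

    Assumption: each item gets `num_choices` candidate buckets drawn
    uniformly at random and goes to the least loaded candidate.
    """
    rng = random.Random(seed)
    loads = [0] * num_buckets
    for _ in range(num_items):
        candidates = [rng.randrange(num_buckets) for _ in range(num_choices)]
        # Greedy rule (assumed): pick the candidate that currently holds the fewest items.
        target = min(candidates, key=lambda b: loads[b])
        loads[target] += 1
    return max(loads)


single = max_load(10_000, 10_000, num_choices=1)
double = max_load(10_000, 10_000, num_choices=2)
print("max load with one choice: ", single)
print("max load with two choices:", double)
```

With one choice the rule degenerates to ordinary single-function placement, so the two printed values can be compared directly; with these sizes the two-choice value is typically the smaller one.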

Multiple-choice hashing is usually explained using the balls-into-bins model.

• A common framework for reasoning about load-balancing processes is that of 'balls' and 'bins', where the demand (keys, processes, files, etc.) is represented by 'balls' and the supply of resources (table slots, servers, storage units, etc.) is represented by 'bins'.
• In this setting, m balls are thrown into n bins sequentially according to some allocation rule.
• The goal is to understand the allocation of balls to bins once the process completes, usually by bounding the load (the number of balls) of the most heavily loaded bin.
• In this model, balls are assigned to bins by applying one or more hash functions.
• These hash functions map a ball's unique id (typically implicit in the model) to the set of bins, typically numbered 1...n.
• Using a hash function to map a ball to a bin, rather than simply drawing a bin at random, is useful in the common case where a ball's location must later be recovered from its id.
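
The last point, recovering a ball's location from its id, can be sketched as follows. This is a minimal illustration and not a definitive implementation: the class `MultiChoiceTable`, the helper `candidate_bins`, the use of SHA-256 with an index prefix to simulate several hash functions, and the least-loaded placement rule are all assumptions made for this example. Bins are numbered from 0 here rather than from 1.

```python
import hashlib


def candidate_bins(ball_id, num_bins, num_choices):
    """Return the bins that the (simulated) hash functions assign to ball_id."""
    bins = []
    for i in range(num_choices):
        # Assumption: hash function i is SHA-256 applied to "i:ball_id".
        digest = hashlib.sha256(f"{i}:{ball_id}".encode()).digest()
        bins.append(int.from_bytes(digest[:8], "big") % num_bins)
    return bins


class MultiChoiceTable:
    """Hypothetical table that stores each ball in one of its candidate bins."""

    def __init__(self, num_bins, num_choices=2):
        self.num_choices = num_choices
        self.bins = [[] for _ in range(num_bins)]

    def insert(self, ball_id):
        choices = candidate_bins(ball_id, len(self.bins), self.num_choices)
        # Assumed rule: store the ball in the least loaded candidate bin.
        target = min(choices, key=lambda b: len(self.bins[b]))
        self.bins[target].append(ball_id)
        return target

    def locate(self, ball_id):
        # Recompute the candidates from the id alone and check only those bins.
        for b in candidate_bins(ball_id, len(self.bins), self.num_choices):
            if ball_id in self.bins[b]:
                return b
        return None


table = MultiChoiceTable(num_bins=64)
placed = {f"key{i}": table.insert(f"key{i}") for i in range(200)}
print(table.locate("key7") == placed["key7"])
```

Because the candidate bins are a deterministic function of the id, a lookup inspects at most `num_choices` bins. Had the bins been drawn at random at insertion time, the location would have to be stored separately or found by scanning every bin.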