Hashing Tutorial

Section 2.1 - Simple Mod Function

Consider the following hash function used to hash integers to a table of sixteen slots:

int h(int x) {
  return x % 16;
}

Note that "%" is the symbol for the mod function.

You should try out this hash function. Insert several values into the table. For each value, before you insert it, try to predict where it will be stored in the table. If you use "Instruction Mode," key values that you enter will be inserted into the table. If you switch to "Test Mode," then you must first indicate which slot in the table will take the next key before the visualization will show you the answer.

Recall that the values 0 to 15 can be represented with four bits (i.e., 0000 to 1111). The value returned by this hash function depends solely on the least significant four bits of the key. Because these bits are likely to be poorly distributed (as an example, a high percentage of the keys might be even numbers, which means that the low order bit is zero), the result will also be poorly distributed. This example shows that the size of the table M can have a big effect on the performance of a hash system because the table size is typically used as the modulus to ensure that the hash function produces a number in the range 0 to M-1\).

The following applet will show you performance information about this hash function. Try with different table sizes, different levels of loading (the number of keys inserted), and different input distributions. In particular, see what happens when the hash table size is a power of 2 and an inappropriate key distribution is used, like the "Lower 8" distribution. You can contrast this with the uniform distribution.