Re: more on hash functions

Paul F. Dietz (dietz@interaccess.com)
Sun, 11 Apr 1999 22:52:39 -0500


Chuck Lever wrote:

[ comments on random table driven hash function ]

Memory use can be reduced by using more than two
tables. If we divide the 24 bit key into 4 6 bit
fields, the total table size is 512 bytes (assuming
each table entry is 2 bytes). This doubles the
number of loads, granted, and increases the shift/mask
work. Divided into 3 8 bit fields (that should be easy),
the memory use is 1.5K bytes.

The objection that you need more random tables
for different hash table sizes is not right.
Just mask off the irrelevant high order bits
of the hash value. For tables containing two-byte
random values, this supports hash tables of size
up to 2^16.

If the multiplicative method works better, it's
probably because the values being hashed have
nonrandomness, such as keys falling into arithmetic
progressions.

I'd be very careful about populating the entries
of the random tables -- be sure the values chosen
really are random. If the low order bit is not
random, for example, you may lose half the hash
table.

I did not find a comment in Knuth vol. 3 (1st ed.)
dismissing this idea. Is it in the 2nd ed.? What did
he say? Universal classes of hash functions had not
been discovered at the time the 1st ed. was written.

Paul

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/