Re: [patch 2/7] lib/hashmod: Add modulo based hash mechanism

From: Linus Torvalds
Date: Thu Apr 28 2016 - 22:25:12 EST


On Thu, Apr 28, 2016 at 4:26 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> I'll try to dig up some time to analyze the hash_long failure unless someone
> familiar with the problem is beating me to it.

I'm not sure you need to spend time analyzing failure: if you get bad
hashing with hash_long(), then we know that is a bad hash without
having to really try to figure out why.

It's the hashes that _look_ like they might be good hashes, but
there's not a lot of analysis behind it, that I would worry about. The
simple prime modulus _should_ be fine, but at the same time I kind of
suspect we can do better. Especially since it has two multiplications.

Looking around, there's

http://burtleburtle.net/bob/hash/integer.html

and that 32-bit "full avalanche" hash in six shifts looks like it
could be better. You wouldn't want to inline it, but the point of a
full avalanche bit mixing _should_ be that you could avoid the whole
"upper bits" part, and it should work independently of the target set
size.

So if that hash works better, it would be a pretty good replacement
option for hash_int().

There is also

https://gist.github.com/badboy/6267743

that has a 64 bit to 32 bit hash function that might be useful for
"hash_long()".

Most of the people who worry about hashes tend to have strings to
hash, not just a single word like a pointer, but there's clearly
people around who have tried to search for good hashes that really
spread out the bits.

Linus