Re: [PATCH 1/4] exfat: Simplify exfat_utf8_d_hash() for code points above U+FFFF

From: Kohada.Tetsuhiro@xxxxxxxxxxxxxxxxxxxxxxxxxxx
Date: Thu Apr 02 2020 - 22:20:09 EST



> I guess it was designed for 8bit types, not for long (64bit types) and
> I'm not sure how effective it is even for 16bit types for which it is
> already used.

In partial_name_hash (), when 8bit value or 16bit value is specified,
upper 8-12bits tend to be 0.

> So question is, what should we do for either 21bit number (one Unicode
> code point = equivalent of UTF-32) or for sequence of 16bit numbers
> (UTF-16)?

If you want to get an unbiased hash value by specifying an 8 or 16-bit value,
the hash32() function is a good choice.
ex1: Prepare by hash32 () function.
hash = partial_name_hash (hash32 (val16,32), hash);
ex2: Use the hash32() function directly.
hash + = hash32 (val16,32);

> partial_name_hash(unsigned long c, unsigned long prevhash)
> {
> return (prevhash + (c << 4) + (c >> 4)) * 11;
> }

Another way may replace partial_name_hash().

return prevhash + hash32(c,32)