Re: Improved dcache name hash

Colin Plumb (colin@nyx.net)
Sat, 12 Sep 1998 23:15:46 -0600 (MDT)


> The new hash looks interesting and it would be good to see some
> comparisons of lookup times with the old and new. I recall that someone
> did some tests indicating that the current scheme wasn't too bad, though
> I'm sure it can be improved.

Actually, I just taught myself the virtue of profiling.

> Putting the parent pointer into the hash in the beginning would likely
> be a problem, as then you'd need to recompute hashes when the parents
> change.

The parents *don't* change. If you look at the definition of the
fs-specific d_hash function (in dentry_operations, as opposed to the generic
d_hash function in dcache.c which doesn't hash names...), the hash is defined
to (possibly) depend on the parent dentry.

If the file gets moved, it might also change names, so the name gets rehashed
regardless. (The code is "find dentry for source; find dentry for dest;
check for legality; shuffle inodes")

I checked, and a parent dentry is always known when the name is hashed,
and the parent never changes.

> Probably a good starting point for comparison would be to gather some
> stats on the evenness of the current hash chains, and then see what the
> new algorithm can do.

I did. Let's just say that the results weren't stellar. I can't believe that
incredibly bad algorithm is as good as it is, but the code is appended if
anybody wants to try.

It computes the number of dentries that must be traversed to look up each
filename in the directories passed as command-line arguments. (Instead of
the parent dentry pointer, it uses the parent inode number, which should
have a similar effect.) A chain of length t costs t*(t+1)/2 to search, so
the total is essentially half the sum of the squares of the hash chain
lengths: the more uniform, the lower the number.

For reference, it also prints the best possible value, achieved if everything
were perfectly uniform (every chain holds either n/1024 or n/1024+1 entries).

The three options compared are:
1) The existing Linux hash
2) The alternate hash I proposed, with the inode number added in at the end
3) The alternate scheme, with the inode number added in at the beginning

For a large single directory (a porn site's .jpg directory), the figures are:
Total cost for 6190 names:
Cost 1: 24802
Cost 2: 24848
Cost 3: 24751
Minimum possible cost: 21826

For a full (.o files and all) linux kernel tree (2.1.122-pre-2, I think),
the figures are:
Total cost for 5811 names:
Cost 1: 22360
Cost 2: 22223
Cost 3: 45913
Minimum possible cost: 19506

And my home directory, which is way too cluttered:
Total cost for 10984 names:
Cost 1: 70778
Cost 2: 69816
Cost 3: 572740
Minimum possible cost: 64504

Let's just say that it appears that not too random is *good*, as we are
getting more uniformity than truly random would provide.

So much for ingenuity. Hash #2 does slightly better, but not
significantly so. I don't particularly like the existing hash because
4 divides 32, so the 1st and 9th bytes are treated identically. But
maybe, even if I think it looks wrong, it ain't broke.

-- 
	-Colin

The test code follows; run it as "find DIR -type d -print0 | xargs -0 ./hashtest".

#include <stdio.h>
#include <sys/types.h>
#include <dirent.h>	/* For readdir() */
#include <unistd.h>
#include <assert.h>

typedef unsigned word32;
typedef word32 hash_t;

#define HASH_BITS 10
#define HASH_SIZE ((hash_t)1<<HASH_BITS)
#define HASH_MASK (HASH_SIZE - 1)

/*
 * The original Linux dcache hash...
 * prevhash = (prevhash << 4) | (prevhash >> (8*sizeof(unsigned long)-4));
 * return prevhash ^ c;
 */

#define HASH1_INIT(hash,ino) ((hash)=0)
#define HASH1_UPDATE(hash,c) ((hash) = ((hash) << 4 | (hash) >> 28) ^ (c))
#define HASH1_FINISH(hash,ino) ((hash)+=(ino),(hash)^=(hash)>>HASH_BITS^(hash)>>2*HASH_BITS)

/*
 * The proposed new hash.
 *
 * word32 one_at_a_time(unsigned char const *key, size_t len)
 * {
 *	word32 hash = 0x9e3779b9;	// Or any arbitrary value
 *	size_t i;
 *
 *	for (i = 0; i < len; ++i) {
 *		hash += key[i];
 *		hash += (hash << 10);
 *		hash ^= (hash >> 6);
 *	}
 *	hash += (hash << 3);
 *	hash ^= (hash >> 11);
 *	hash += (hash << 15);
 *
 *	return hash;
 * }
 */

#define HASH2_UPDATE(hash,c) ((hash)+=(c),(hash)+=(hash)<<10,(hash)^=(hash)>>6)
#define HASH3_UPDATE HASH2_UPDATE

#define HASH2_INIT(hash,ino) ((hash)=0x9e3779b9)
#define HASH3_INIT(hash,ino) (HASH2_INIT(hash,ino),HASH2_UPDATE(hash,ino))

#define HASH3_FINISH(hash,ino) ((hash)+=(hash)<<3,(hash)^=(hash)>>11,(hash)+=(hash)<<15)
#define HASH2_FINISH(hash,ino) (HASH2_UPDATE(hash,ino),HASH3_FINISH(hash,ino))

static unsigned hashes1[1<<HASH_BITS];
static unsigned hashes2[1<<HASH_BITS];
static unsigned hashes3[1<<HASH_BITS];

int hashdir(DIR *dir)
{
	hash_t ino;
	hash_t hash1, hash2, hash3;
	struct dirent *d;
	char *p;
	unsigned char c;

	/* Expect "." first; record the directory's own inode number. */
	d = readdir(dir);
	assert(d);
	assert(d->d_name[0] == '.');
	assert(d->d_name[1] == 0);
	ino = d->d_ino;

	/* Then "..". */
	d = readdir(dir);
	assert(d);
	assert(d->d_name[0] == '.');
	assert(d->d_name[1] == '.');
	assert(d->d_name[2] == 0);

	while ((d = readdir(dir))) {
		HASH1_INIT(hash1,ino);
		HASH2_INIT(hash2,ino);
		HASH3_INIT(hash3,ino);

		for (p = d->d_name; (c = *p) != 0; p++) {
			HASH1_UPDATE(hash1, c);
			HASH2_UPDATE(hash2, c);
			HASH3_UPDATE(hash3, c);
		}
		HASH1_FINISH(hash1,ino);
		HASH2_FINISH(hash2,ino);
		HASH3_FINISH(hash3,ino);

		hashes1[hash1 & HASH_MASK]++;
		hashes2[hash2 & HASH_MASK]++;
		hashes3[hash3 & HASH_MASK]++;
	}
	return 0;
}

int main(int argc, char **argv)
{
	DIR *dir;
	unsigned t;
	unsigned total;
	unsigned long cost1, cost2, cost3;
	unsigned long cost0;
	int i;

	while (*++argv) {
		dir = opendir(*argv);
		if (!dir) {
			perror(*argv);
			return 1;
		}
		hashdir(dir);
		closedir(dir);
	}

	/*
	 * Now a figure of merit is the average lookup time.
	 * For a list of length n, this is n*(n+1)/2.
	 * Add up the time for all the lists.
	 */
	cost1 = cost2 = cost3 = 0;
	total = 0;

	for (i = 0; i < HASH_SIZE; i++) {
		t = hashes1[i];
		cost1 += t*(t+1)/2;
		t = hashes2[i];
		cost2 += t*(t+1)/2;
		t = hashes3[i];
		cost3 += t*(t+1)/2;
		total += t;
	}
	printf("Total cost for %u names:\n", total);
	printf("Cost 1: %lu\n", cost1);
	printf("Cost 2: %lu\n", cost2);
	printf("Cost 3: %lu\n", cost3);

	/* Compute minimum possible cost. */
	t = total/HASH_SIZE;
	cost0 = HASH_SIZE * t*(t+1)/2;
	/*
	 * total % HASH_SIZE chains have t+1 entries.
	 * (t+1)*(t+2)/2 - t*(t+1)/2 = 2*(t+1)/2 = t+1.
	 */
	cost0 += (total % HASH_SIZE) * (t+1);
	printf("Minimum possible cost: %lu\n", cost0);

	return 0;
}
