Re: hashed waitqueues

From: William Lee Irwin III (
Date: Fri Jan 04 2002 - 18:48:13 EST

William Lee Irwin III wrote:
+ /*
+ * Although the default semantics of wake_up() are
+ * to wake all, here the specific function is used
+ * to make it even more explicit that a number of
+ * pages are being waited on here.
+ */
+ if(waitqueue_active(page_waitqueue(page)))
+ wake_up_all(page_waitqueue(page));

On Fri, Jan 04, 2002 at 03:07:20PM -0800, Andrew Morton wrote:
> Does the compiler CSE these two calls to page_waitqueue()?
> All versions? I'd be inclined to do CSE-by-hand here.

I'm not sure if CSE is run before or after inlining, but it
is probably not wise to leave it up to the compiler.

On Fri, Jan 04, 2002 at 03:07:20PM -0800, Andrew Morton wrote:
> Also, why wake_up_all()? That will wake all tasks which are sleeping
> in __lock_page(), even though they've asked for exclusive wakeup
> semantics. Will a bare wake_up() here not suffice?

A couple of other private responses pointed this out, and also had
suggestions for several ways to avoid the thundering herds. I am not
sure it's possible to honor exclusive wakeup requests here without
creating problems or otherwise having to propagate semantic changes
further up the call chain. I think that comment and one of the others
needs to be expanded to make the exclusive wakeup issue clearer.

Also there is a bug in the hash function (pointed out by an anonymous
private response):

On Fri, Jan 04, 2002 at 07:16:33PM -0000, an anonymous person told me:
> One *bug* in your code is that if toy have 64-bit longs, your
> GOLDEN_RATIO_PRIME isn't large enough and page_waitqueue will
> always compute hash = 0.
> The closest primes to phi * 2^64 = 11400714819323198485.95... are:
> ...
> 11400714819323198333
> 11400714819323198389
> 11400714819323198393
> *
> 11400714819323198549
> 11400714819323198581
> 11400714819323198647
> ...

which is easy enough to repair with a conditional definition of

The same person also mentioned that less work could be done in
the hash function by storing the shift amount directly in the zone,
and also pointed out that the masking operation is unnecessary as the
shift already annihilates the high-order bits.

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to
More majordomo info at
Please read the FAQ at

This archive was generated by hypermail 2b29 : Mon Jan 07 2002 - 21:00:27 EST