Re: [PATCH] futex: add FUTEX_SET_WAIT operation

From: Darren Hart
Date: Wed Nov 18 2009 - 00:41:15 EST


Michel Lespinasse wrote:

> One difficulty with adaptive spinning is that we want to avoid deadlocks.
> If two threads end up spinning in-kernel waiting for each other, we better
> have preemption enabled... or detect and deal with the situation somehow.

This is really only a problem for SCHED_FIFO tasks, right? (SCHED_OTHER tasks should get scheduled out when CFS deems they've exhausted their fair share.) Real-time tasks typically should be using PI anyway, as adaptive locking is non-deterministic and doesn't provide for PI. So I'm not sure how critical this problem is in practice.

> Also one aspect I dislike is that this would impose a given format on the
> futex for storing the TID.

We do have a precedent for this with robust as well as PI futexes.

> I would prefer if there were several bits available
> in the futex for userspace to do whatever they want. 8 bits would likely
> be enough, which leaves 24 for the TID - enough for us, but I have no idea
> if that's good enough for upstream inclusion. If that's not possible,
> one possible compromise could be:

And we already use two of those bits for FUTEX_OWNER_DIED and FUTEX_WAITERS. Perhaps you just have to choose between your own value scheme and adaptive spinning (sounds horribly limiting as I'm typing this...).
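
To make the bit budget concrete, here is a rough sketch of the two layouts side by side. The PI_FUTEX_* values mirror the existing FUTEX_WAITERS, FUTEX_OWNER_DIED and FUTEX_TID_MASK definitions in <linux/futex.h>. Everything prefixed sk_/SK_ is hypothetical: the proposal does not say where the 8 userspace bits would live, so placing them in the top byte is purely an assumption for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Existing PI/robust layout (same values as <linux/futex.h>):
 * two flag bits on top, 30 bits of TID below. */
#define PI_FUTEX_WAITERS    0x80000000u
#define PI_FUTEX_OWNER_DIED 0x40000000u
#define PI_FUTEX_TID_MASK   0x3fffffffu

/* Hypothetical 8/24 split: ASSUMES the userspace bits sit in the top
 * byte and the TID in the low 24 bits. Not an existing kernel ABI. */
#define SK_TID_BITS   24u
#define SK_TID_MASK   ((1u << SK_TID_BITS) - 1u)   /* 0x00ffffff */
#define SK_USER_SHIFT SK_TID_BITS

/* Build a futex word from 8 userspace-defined bits and an owner TID. */
static inline uint32_t sk_pack(uint8_t user_bits, uint32_t tid)
{
    assert(tid <= SK_TID_MASK);   /* TID must fit in 24 bits */
    return ((uint32_t)user_bits << SK_USER_SHIFT) | tid;
}

/* Extract the owner TID from a futex word. */
static inline uint32_t sk_tid(uint32_t word)
{
    return word & SK_TID_MASK;
}

/* Extract the userspace-defined bits from a futex word. */
static inline uint8_t sk_user(uint32_t word)
{
    return (uint8_t)(word >> SK_USER_SHIFT);
}
```

With this split the kernel would only ever need sk_tid() to find the owner; the meaning of the top byte stays entirely up to userspace.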


> - userspace passes a TID (which it extracted from the futex value; the kernel
>   does not necessarily know how)
> - kernel spins until that TID goes to sleep, or the futex value is no longer
>   equal to val or setval
> - if val != setval and the futex value is val, set it to setval
> - if the futex value is setval, block; otherwise return -EWOULDBLOCK.

> If the lock got stolen from a different thread, userspace can decide to
> retry with or without adaptive spinning.

I'll think on this a bit more...


> That would be the most generic interface I can think of, though it's
> starting to be a LOT of parameters - actually, too many to pass through
> the _syscall6 interface.
>
> I also like Darren's suggestion to do a FUTEX_SET_WAIT_REQUEUE_PI,
> but it's hitting the same 'too many parameters' limitation as well :/

We don't use val2 for FUTEX_WAIT_REQUEUE_PI, so we should be able to use that for setval.


--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team