Re: POSIX mutex destruction requirements vs. futexes

From: Linus Torvalds
Date: Mon Dec 01 2014 - 13:31:37 EST


On Mon, Dec 1, 2014 at 4:05 AM, Torvald Riegel <triegel@xxxxxxxxxx> wrote:
> On Thu, 2014-11-27 at 11:38 -0800, Linus Torvalds wrote:
>>
>> > (1) Allow spurious wake-ups from FUTEX_WAIT.
>>
>> because afaik that is what we actually *do* today (we'll wake up
>> whoever re-used that location in another thread), and it's mainly
>> about the whole documentation issue. No?
>
> If that's what the kernel prefers to do, this would be fine with me.

I think it's more of a "can we even do anything else"?

The kernel doesn't even see the reuse of the futex, or the fast path
parts. Not seeing the fast path is obviously by design, and not seeing
the reuse is partly due to the pthreads interfaces (I guess people
should generally call pthread_mutex_destroy(), but I doubt people
really do, and even if they did, how would user space actually handle
the nasty race between "pthread_mutex_unlock() -> stale futex_wake" and
"pthread_mutex_destroy()" anyway?).
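
To make that race concrete, here is a minimal sketch of a futex-based
lock, assuming a three-state word (0 = unlocked, 1 = locked, 2 = locked
with waiters). The names toy_mutex, toy_lock and toy_unlock are made up
for illustration, and this is not glibc's actual pthread mutex code; the
point is only the window in the unlock path between releasing the word
and issuing the FUTEX_WAKE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical toy mutex word: 0 = unlocked, 1 = locked, 2 = locked with waiters. */
typedef _Atomic uint32_t toy_mutex;

/* Assumption of this sketch: the atomic type is a plain 32-bit futex word. */
static_assert(sizeof(toy_mutex) == sizeof(uint32_t), "futex words are 32 bits");

static long toy_futex(toy_mutex *word, int op, uint32_t val)
{
    return syscall(SYS_futex, word, op, val, NULL, NULL, 0);
}

static void toy_lock(toy_mutex *m)
{
    uint32_t expected = 0;

    /* Fast path: uncontended 0 -> 1, the kernel never hears about it. */
    if (atomic_compare_exchange_strong(m, &expected, 1))
        return;

    /* Slow path: mark the word contended and sleep while it stays at 2. */
    while (atomic_exchange(m, 2) != 0)
        toy_futex(m, FUTEX_WAIT, 2);
}

/* Returns how many waiters the kernel reports waking (0 on the fast path). */
static long toy_unlock(toy_mutex *m)
{
    /* Release the lock; if nobody was waiting, no syscall is made. */
    if (atomic_exchange(m, 0) != 2)
        return 0;

    /*
     * RACE WINDOW: the word is already 0 here. Another thread may lock,
     * unlock, destroy and free this mutex, and the memory may be reused
     * for something else, all before the wake below is issued. The wake
     * is then "stale": it targets whatever lives at this address now.
     */
    return toy_futex(m, FUTEX_WAKE, 1);
}
```

The kernel only ever sees the FUTEX_WAIT and FUTEX_WAKE calls on an
address; the fast paths and any reuse of the memory happen entirely in
user space.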

So the thing is, I don't see how we could possibly change the existing
FUTEX_WAKE behavior.

And introducing a new "debug mode" that explicitly adds spurious
events might as well be done in user space with a wrapper around
FUTEX_WAKE or something.
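
A user-space wrapper of that kind could look roughly like the sketch
below. The function debug_futex_wake and the debug_inject_wakes switch
are hypothetical names, not an existing glibc or kernel interface; the
sketch assumes that "adding a spurious event" simply means issuing one
extra FUTEX_WAKE on the same address with no accompanying state change.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical debug switch: nonzero makes every wake issue one extra wake. */
static int debug_inject_wakes = 1;

/*
 * Wake up to nr_wake waiters on uaddr, as a plain FUTEX_WAKE would.
 * In debug mode, additionally wake one more waiter that the caller did
 * not ask for, so code that cannot tolerate unexpected wakes fails early.
 * Returns the total number of waiters woken, or -1 on error.
 */
static long debug_futex_wake(uint32_t *uaddr, int nr_wake)
{
    assert(uaddr != NULL && nr_wake >= 1);

    long woken = syscall(SYS_futex, uaddr, FUTEX_WAKE, nr_wake, NULL, NULL, 0);
    if (woken < 0 || !debug_inject_wakes)
        return woken;

    /* The injected wake: same address, no change to the futex word. */
    long extra = syscall(SYS_futex, uaddr, FUTEX_WAKE, 1, NULL, NULL, 0);
    return extra > 0 ? woken + extra : woken;
}
```

Nothing here needs kernel support, since the injected wake is an
ordinary FUTEX_WAKE as far as the kernel can tell.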

Because as far as the *kernel* is concerned, there is no "spurious
wake" event. It's not spurious. It's real and exists, and the wakeup
was really done by the user. The fact that it's really a stale wakeup
for a previous allocation of a pthread mutex data structure is
something that is completely and fundamentally invisible to the
kernel.

No?
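
The effect can be sketched in a few lines, assuming a Linux system with
pthreads. All names below (reused_word, new_owner_waits,
stale_wake_demo) are invented for the illustration: reused_word stands
in for memory that used to hold a mutex and now belongs to something
else, and the main thread plays the previous owner whose wake arrives
late. The wake is retried in a loop only so the demonstration is
deterministic.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Stands in for memory that was freed and then reallocated. */
static _Atomic uint32_t reused_word;
static long waiter_rc = -1;

/* The new user of the memory sleeps until somebody wakes this address. */
static void *new_owner_waits(void *arg)
{
    (void)arg;
    waiter_rc = syscall(SYS_futex, &reused_word, FUTEX_WAIT, 0, NULL, NULL, 0);
    return NULL;
}

/* Returns 1 if the waiter was woken although the word never changed. */
static int stale_wake_demo(void)
{
    pthread_t t;

    atomic_store(&reused_word, 0);
    waiter_rc = -1;

    int err = pthread_create(&t, NULL, new_owner_waits, NULL);
    assert(err == 0);
    (void)err;

    /*
     * The previous owner's delayed FUTEX_WAKE. It does not touch the
     * word; retry until the kernel reports that it woke one waiter.
     */
    while (syscall(SYS_futex, &reused_word, FUTEX_WAKE, 1, NULL, NULL, 0) != 1)
        sched_yield();

    pthread_join(t, NULL);

    /* FUTEX_WAIT returned 0 (a real wake), yet the value is still 0. */
    return waiter_rc == 0 && atomic_load(&reused_word) == 0;
}
```

From the kernel's side this is a perfectly ordinary, requested wake on
that address; only user space knows it was meant for an object that no
longer exists.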

So even from a documentation standpoint, it's not so much about "kernel
interfaces" being incorrectly documented as about a big honking
warning about the internal pthread_mutex implementation in a library,
and the impact that library choice has on allocation re-use. Even if
FUTEX_WAKE is guaranteed not to be spurious from a *kernel*
standpoint, certain use models will create their own spurious wake
events in very subtle ways.

So I do disagree with you on the "explicitly allow spurious wake-ups",
in the sense that no, the kernel doesn't do that. But it is definitely
worth adding documentation about how users can cause their *own*
"spurious" wakeups.

So I think we continue to guarantee "no spurious wakeups" from a
kernel interface standpoint. Old programs that depend on it will
continue to work.

But at the same time, old programs built on libraries that cause their
own "spurious" wakeups without realizing it are broken. That's not due
to some kernel issue, it's a user-mode internal implementation problem
(one that is easy to get wrong - as mentioned, we've had this bug
ourselves inside the kernel).
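
One common way for user-space code to tolerate such self-inflicted
wakeups is to treat a return from FUTEX_WAIT as a hint and re-check its
own state in a loop. The sketch below assumes a simple flag protocol
(0 = not ready, nonzero = ready); wait_until_set and set_and_wake are
hypothetical helper names, not part of any library.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Block until *flag is nonzero, regardless of who wakes this address. */
static void wait_until_set(_Atomic uint32_t *flag)
{
    while (atomic_load(flag) == 0) {
        long rc = syscall(SYS_futex, flag, FUTEX_WAIT, 0, NULL, NULL, 0);
        /*
         * rc == 0 only says that somebody issued FUTEX_WAKE on this
         * address, possibly a stale wake meant for an earlier object.
         * EAGAIN means the word was already nonzero; EINTR is a signal.
         * In every case the loop condition decides, not the return code.
         */
        assert(rc == 0 || errno == EAGAIN || errno == EINTR);
        (void)rc;
    }
}

/* Publish the state change first, then wake everyone waiting on it. */
static void set_and_wake(_Atomic uint32_t *flag)
{
    atomic_store(flag, 1);
    syscall(SYS_futex, flag, FUTEX_WAKE, INT_MAX, NULL, NULL, 0);
}
```

With this pattern a stale wake costs one extra trip around the loop
instead of being mistaken for the event the waiter was waiting for.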

Linus