Re: Memory barrier needed with wake_up_process()?

From: Peter Zijlstra
Date: Mon Sep 05 2016 - 04:08:48 EST


On Sat, Sep 03, 2016 at 10:16:31AM -0400, Alan Stern wrote:

> > Sorry, but that is horrible code. A barrier cannot ensure writes are
> > 'complete', at best they can ensure order between writes (or reads
> > etc..).
>
> The code is better than the comment. What I really meant was that the
> write of bh->state needs to be visible to the thread after it wakes up
> (or after it checks the wakeup condition and skips going to sleep).

Yeah, I got that.

> > Also, looking at that thing, that common->thread_wakeup_needed variable
> > is 100% redundant. All sleep_thread() invocations are inside a loop of
> > sorts and basically wait for other conditions to become true.
> >
> > For example:
> >
> > 	while (bh->state != BUF_STATE_EMPTY) {
> > 		rc = sleep_thread(common, false);
> > 		if (rc)
> > 			return rc;
> > 	}
> >
> > All you care about there is bh->state, _not_
> > common->thread_wakeup_needed.
>
> You know, I never went through and verified that _all_ the invocations
> of sleep_thread() are like that.

Well, thing is, they're all inside a loop which checks other conditions
for forward progress. Therefore the loop inside sleep_thread() is
pointless. Even if you were to return early, you'd simply loop in the
outer loop and go back to sleep again.
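
A minimal userspace sketch of that point, assuming a pthread condition
variable as a stand-in for the kernel sleep/wakeup primitives. The names
struct demo, sleep_once(), waker() and run_demo() are hypothetical and not
taken from the gadget driver. sleep_once() deliberately has no inner loop
and may return while the condition is still false; the outer loop on
->state is what guarantees forward progress.

```c
#include <assert.h>
#include <pthread.h>

enum buf_state { BUF_STATE_FULL, BUF_STATE_EMPTY };

/* Hypothetical stand-in for the shared driver state. */
struct demo {
	pthread_mutex_t lock;
	pthread_cond_t wake;
	enum buf_state state;	/* the condition the waiter really cares about */
	int returns;		/* how often sleep_once() came back */
};

/* Sleep exactly once; may return with ->state unchanged. Caller holds ->lock. */
static void sleep_once(struct demo *d)
{
	pthread_cond_wait(&d->wake, &d->lock);
	d->returns++;
}

static void *waker(void *arg)
{
	struct demo *d = arg;

	/* A wakeup that does not change the condition (an "early return"). */
	pthread_mutex_lock(&d->lock);
	pthread_cond_signal(&d->wake);
	pthread_mutex_unlock(&d->lock);

	/* The wakeup that actually makes the condition true. */
	pthread_mutex_lock(&d->lock);
	d->state = BUF_STATE_EMPTY;
	pthread_cond_signal(&d->wake);
	pthread_mutex_unlock(&d->lock);
	return NULL;
}

/* Returns 1 if the waiter left the loop with the buffer empty, -1 on error. */
static int run_demo(void)
{
	struct demo d = { .state = BUF_STATE_FULL, .returns = 0 };
	pthread_t t;

	pthread_mutex_init(&d.lock, NULL);
	pthread_cond_init(&d.wake, NULL);

	pthread_mutex_lock(&d.lock);
	if (pthread_create(&t, NULL, waker, &d) != 0) {
		pthread_mutex_unlock(&d.lock);
		return -1;
	}

	/* Outer loop: an early return from sleep_once() just loops again. */
	while (d.state != BUF_STATE_EMPTY)
		sleep_once(&d);
	pthread_mutex_unlock(&d.lock);

	pthread_join(t, NULL);
	pthread_cond_destroy(&d.wake);
	pthread_mutex_destroy(&d.lock);
	return d.state == BUF_STATE_EMPTY;
}
```

Calling run_demo() from a main() should return 1 no matter how many times
sleep_once() came back early, since the exit condition is checked only in
the outer loop.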

> In fact, I wrote the sleep/wakeup
> routines _before_ the rest of the code, and I didn't know in advance
> exactly how they were going to be called.

Still seems strange to me, why not use wait-queues for the first cut?

Only if you find a performance issue with wait-queues, which cannot be
fixed in the wait-queue proper, then do you do custom thingies.

Starting with a custom sleeper just doesn't make sense to me.

> > That said, I cannot spot an obvious fail, but the code can certainly use
> > help.
>
> The problem may be that when the thread wakes up (or skips going to
> sleep), it needs to see more than just bh->state. Those other values
> it needs are not written by the same CPU that calls wakeup_thread(),
> and so to ensure that they are visible that smp_wmb() really ought to
> be smp_mb() (and correspondingly in the thread). That's what Felipe has
> been testing.

So you're saying something like:


	CPU0		CPU1			CPU2

	X = 1					sleep_thread()
			wakeup_thread()
						r = X

But how does CPU1 know to do the wakeup? That is, how are CPU0 and CPU1
coupled?
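
To make the question concrete, here is a userspace sketch of one
hypothetical coupling, using C11 atomics rather than kernel barriers. The
handoff flag is purely an assumption for illustration; nothing in this
thread says the driver works this way. If CPU0 publishes X with a release
store that CPU1 observes with an acquire load before doing the wakeup, and
the sleeper acquires the wakeup flag, then happens-before is transitive
across the chain and CPU2 must read X == 1.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static int X;			/* plain data written by "CPU0" */
static atomic_int handoff;	/* hypothetical CPU0 -> CPU1 coupling */
static atomic_int wakeup;	/* stands in for the wakeup condition */

static void *cpu0(void *arg)
{
	(void)arg;
	X = 1;
	/* Release: the write to X is ordered before the flag becomes visible. */
	atomic_store_explicit(&handoff, 1, memory_order_release);
	return NULL;
}

static void *cpu1(void *arg)
{
	(void)arg;
	/* Acquire: this is how CPU1 "knows" it should do the wakeup. */
	while (!atomic_load_explicit(&handoff, memory_order_acquire))
		sched_yield();
	/* Models wakeup_thread(). */
	atomic_store_explicit(&wakeup, 1, memory_order_release);
	return NULL;
}

static void *cpu2(void *arg)
{
	int *r = arg;

	/* Models the sleep_thread() loop: wait until the wakeup is observed. */
	while (!atomic_load_explicit(&wakeup, memory_order_acquire))
		sched_yield();
	*r = X;
	return NULL;
}

/* Runs the three "CPUs" once and returns the value CPU2 read from X. */
static int run_three_cpus(void)
{
	pthread_t t0, t1, t2;
	int r = 0;

	X = 0;
	atomic_store(&handoff, 0);
	atomic_store(&wakeup, 0);

	if (pthread_create(&t2, NULL, cpu2, &r) != 0)
		return -1;
	if (pthread_create(&t1, NULL, cpu1, NULL) != 0)
		return -1;
	if (pthread_create(&t0, NULL, cpu0, NULL) != 0)
		return -1;

	pthread_join(t0, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return r;
}
```

Without something like the handoff step, nothing orders X = 1 before the
wakeup, and r = X on CPU2 would be free to observe the old value.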