Re: [patch 2/3] scheduler: add full memory barriers upon task switch at runqueue lock/unlock

From: Linus Torvalds
Date: Mon Feb 01 2010 - 15:53:39 EST




On Mon, 1 Feb 2010, Steven Rostedt wrote:
>
> But a race exists between the reading of the mm_cpumask and sending the
> IPI. There is in fact two different problems with this race. One is that
> a thread scheduled away, but never issued an mb(), the other is that a
> running task just came in and we never saw it.

I get it. But the thing I object to here is that Mathieu claims that we
need _two_ memory barriers in the switch_mm() code.

And I'm still not seeing it.

You claim that the rule is that "you have to do a mb on all threads", and
that there is a race if a thread switches away just as we're about to do
that.

Fine. But why _two_? And what's so magical about the mm_cpumask that it
needs to be around it?

If the rule is that we do a memory barrier as we switch an mm, then why
does that single one not just handle it? Either the CPU kept running that
mm (and the IPI will do the memory barrier), or the CPU didn't (and the
switch_mm had a memory barrier).
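
To spell out that case split, here is a toy userspace model in C. It is
not kernel code: the enum, the function names and the "window" (the time
during which the sender reads mm_cpumask and sends its IPIs) are all
assumptions made up for illustration. It assumes switch_mm() contains
exactly one smp_mb() and that the IPI handler runs one too. It only counts
barriers per CPU; it says nothing about how those barriers are ordered
against the mask update, which is the part still under dispute.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: what a remote CPU did with respect to the mm during the window. */
enum cpu_history { KEPT_RUNNING, SWITCHED_AWAY, SWITCHED_IN };

/*
 * Count the full barriers the remote CPU executed in the window,
 * assuming one smp_mb() in switch_mm() and one in the IPI handler.
 */
static int barriers_executed(enum cpu_history h, bool sender_saw_bit)
{
    int n = 0;

    /* A CPU that never left the mm had its bit set throughout,
     * so the sender cannot have missed it. */
    if (h == KEPT_RUNNING)
        sender_saw_bit = true;

    if (sender_saw_bit)
        n++;            /* barrier run by the IPI handler */
    if (h != KEPT_RUNNING)
        n++;            /* the single barrier inside switch_mm() */

    return n;
}

/* Walk every combination and check that no CPU ends up with zero barriers. */
static void check_all_cases(void)
{
    for (int h = KEPT_RUNNING; h <= SWITCHED_IN; h++)
        for (int saw = 0; saw <= 1; saw++)
            assert(barriers_executed((enum cpu_history)h, saw != 0) >= 1);
}
```

Calling check_all_cases() from a main() passes under these assumptions:
every CPU that touched the mm in the window ran at least one barrier, via
the IPI or via switch_mm(). Whether that barrier lands on the right side of
the mask read is exactly what this model leaves out.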

Without locking, I don't see how you can really have any stronger
guarantees, and as per my previous email, I don't see how the smp_mb()
around mm_cpumask accesses helps - because the other CPU is still not going
to atomically "see the mask and IPI". It's going to see one value or the
other, and the smp_mb() around the access doesn't seem to have anything to
do with which value it sees.

So I can kind of understand the "We want to guarantee that switching MM's
around wants to be a memory barrier". Quite frankly, I haven't thought even
that through entirely, so who knows... But the "we need to have memory
barriers on both sides of the bit setting/clearing" I don't get.

IOW, show me why that cpumask is _so_ important that the placement of the
memory barriers around it matters, to the point where you want to have it
on both sides.

Maybe you've really thought about this very deeply, but the explanations
aren't getting through to me. Educate me.

Linus