Re: [PATCH 1/3] sched, timer: Remove usages of ACCESS_ONCE in the scheduler

From: Ingo Molnar
Date: Thu Apr 16 2015 - 14:02:43 EST



* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Wed, Apr 15, 2015 at 09:46:01AM +0200, Ingo Molnar wrote:
>
> > @@ -2088,7 +2088,7 @@ void task_numa_fault(int last_cpupid, int mem_node, int pages, int flags)
> >
> > static void reset_ptenuma_scan(struct task_struct *p)
> > {
> > - ACCESS_ONCE(p->mm->numa_scan_seq)++;
> > + WRITE_ONCE(p->mm->numa_scan_seq, READ_ONCE(p->mm->numa_scan_seq) + 1);
>
> vs
>
> seq = ACCESS_ONCE(p->mm->numa_scan_seq);
> if (p->numa_scan_seq == seq)
> return;
> p->numa_scan_seq = seq;
>
>
> > So the original ACCESS_ONCE() barriers were misguided to begin with: I
> > think they tried to handle races with the scheduler balancing softirq
> > and tried to avoid having to use atomics for the sequence counter
> > (which would be overkill), but things like ACCESS_ONCE(x)++ never
> > guaranteed atomicity (or even coherency) of the update.
> >
> > But since in reality this is only statistical sampling code, all these
> > compiler barriers can be removed I think. Peter, Mel, Rik, do you
> > agree?
>
> ACCESS_ONCE() is not a compiler barrier

It's not a general compiler barrier (and I didn't claim so) but it is
still a compiler barrier: it's documented as a weak, variable-specific
barrier in Documentation/memory-barriers.txt:

COMPILER BARRIER
----------------

The Linux kernel has an explicit compiler barrier function that prevents the
compiler from moving the memory accesses either side of it to the other side:

barrier();

This is a general barrier -- there are no read-read or write-write variants
of barrier(). However, ACCESS_ONCE() can be thought of as a weak form
for barrier() that affects only the specific accesses flagged by the
ACCESS_ONCE().

[...]
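
To make the "weak, variable-specific" part concrete, here is a minimal
userspace sketch. It assumes GCC or Clang (for __typeof__) and models
ACCESS_ONCE() on the classic volatile-cast form; the counter name and
bump_seq() are hypothetical stand-ins, not the scheduler code. It shows
that ACCESS_ONCE(x)++ amounts to one forced load plus one forced store,
with nothing making the pair atomic:

```c
#include <assert.h>

/* Userspace rendition of the volatile-cast form of ACCESS_ONCE():
 * the compiler must emit exactly one access for each use, but only
 * for the object named in the macro argument. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the mm-wide sequence counter. */
int sketch_scan_seq;

/* Spelled-out equivalent of ACCESS_ONCE(sketch_scan_seq)++:
 * a single load followed by a single store. Another writer can
 * still slip in between the two accesses. */
int bump_seq(void)
{
	int old = ACCESS_ONCE(sketch_scan_seq);   /* one load  */

	ACCESS_ONCE(sketch_scan_seq) = old + 1;   /* one store */
	return old + 1;
}
```

Called from a single thread, bump_seq() simply counts up; the point is
only that the volatile accesses constrain the compiler for this one
variable and say nothing about other CPUs.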

> The 'read' side uses ACCESS_ONCE() for two purposes:
> - to load the value once, we don't want the seq number to change under
> us for obvious reasons
> - to avoid load tearing and observe weird seq numbers
>
> The update side uses ACCESS_ONCE() to avoid write tearing, and
> strictly speaking it should also worry about read-tearing since it's
> not hard serialized, although it's very unlikely to actually have
> concurrency (IIRC).

So what bad effects can there be from the very unlikely read and write
tearing?

AFAICS nothing particularly bad. On the read side:

seq = ACCESS_ONCE(p->mm->numa_scan_seq);
if (p->numa_scan_seq == seq)
return;
p->numa_scan_seq = seq;

If p->mm->numa_scan_seq gets loaded twice (very unlikely), and the two
loads return different values, then we might get a 'double' NUMA
placement run - i.e. statistical noise.
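
The read-side pattern can be sketched in userspace as follows. This is
an illustration under assumptions, not the kernel code: the struct
names and placement_due() are hypothetical, and ACCESS_ONCE() is
modelled as a plain volatile cast (GCC/Clang __typeof__ assumed).

```c
#include <assert.h>

#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical, stripped-down stand-ins for mm_struct/task_struct. */
struct mm_sketch {
	int numa_scan_seq;        /* bumped by the scanner */
};

struct task_sketch {
	struct mm_sketch *mm;
	int numa_scan_seq;        /* last sequence this task acted on */
};

/* Returns 1 if a new scan sequence was observed (a placement run is
 * due), 0 otherwise. The snapshot is taken once so the comparison and
 * the assignment both see the same value. */
int placement_due(struct task_sketch *p)
{
	int seq = ACCESS_ONCE(p->mm->numa_scan_seq);

	if (p->numa_scan_seq == seq)
		return 0;
	p->numa_scan_seq = seq;
	return 1;
}
```

Without the single snapshot, the compare and the store could in
principle use two different loads, which is the 'double' placement
case described above.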

On the update side:

ACCESS_ONCE(p->mm->numa_scan_seq)++;
p->mm->numa_scan_offset = 0;

If the compiler tears that up we might skip an update - again
statistical noise at worst.

Nor is compiler tearing the only theoretical worry here: with long
cache coherency latencies we might get two updates 'mixed up',
resulting in a (single) missed update.

Only atomics would solve all the races, but I think that would be
overdoing it.
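
For illustration, here is a deterministic userspace sketch of the
missed update, next to what an atomic increment would buy. It assumes
C11 <stdatomic.h>; both function names are hypothetical, and the "two
CPUs" are simulated by hand-interleaving the loads and stores in one
thread rather than by real concurrency.

```c
#include <assert.h>
#include <stdatomic.h>

/* Two non-atomic increments whose loads both happen before either
 * store: the second store overwrites the first, so one update is
 * lost and the counter ends at 1 instead of 2. */
int lost_update_demo(void)
{
	int seq = 0;
	int a = seq;      /* "CPU A" loads 0 */
	int b = seq;      /* "CPU B" loads 0 */

	seq = a + 1;      /* "CPU A" stores 1 */
	seq = b + 1;      /* "CPU B" stores 1 again */
	return seq;
}

/* With an atomic read-modify-write each increment is indivisible,
 * so no interleaving of two increments can lose one. */
int atomic_update_demo(void)
{
	atomic_int seq = 0;

	atomic_fetch_add(&seq, 1);
	atomic_fetch_add(&seq, 1);
	return atomic_load(&seq);
}
```

The first function returns 1, the second returns 2. For a counter
that only feeds statistical sampling, the occasional missing
increment is the whole cost of skipping the atomic.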

This is what I meant when I said there's no harm from this race.

Thanks,

Ingo