Re: Question on smp_mb__before_spinlock

From: Will Deacon
Date: Wed Sep 07 2016 - 09:51:53 EST


On Wed, Sep 07, 2016 at 03:23:54PM +0200, Peter Zijlstra wrote:
> On Wed, Sep 07, 2016 at 10:17:26PM +1000, Nicholas Piggin wrote:
> > It seems okay, but why not make it a special sched-only function name
> > to prevent it being used in generic code?
> >
> > I would not mind seeing responsibility for the switch barrier moved to
> > generic context switch code like this (the alternative for reducing the
> > number of hwsync instructions on powerpc was to add documentation and
> > warnings about the barriers in arch-dependent and arch-independent code).
> > And pairing it with a spinlock is reasonable.
> >
> > It may not strictly be an "smp_" style of barrier if MMIO accesses are to
> > be ordered here too, even though the critical section may only provide
> > acquire/release for cacheable memory, so maybe it's slightly more
> > complicated than just cacheable RCsc?
>
> Interesting idea..
>
> So I'm not a fan of that raw_spin_lock wrapper, since that would end up
> with a lot more boiler-plate code than just the one extra barrier.
>
> But moving MMIO/DMA/TLB etc.. barriers into this spinlock might not be a
> good idea, since those are typically fairly heavy barriers, and it's
> quite common to call schedule() without ending up in switch_to().
>
> For PowerPC it works out, since there's only SYNC, no other option
> afaik.
>
> But ARM/ARM64 will have to do DSB(ISH) instead of DMB(ISH). IA64 would
> need to issue "sync.i" and mips-octeon "synciobdma".
>
> Will, any idea of the extra cost involved in DSB vs DMB?

DSB is *much* more expensive, since it completes out-of-band communication
such as MMIO accesses and TLB invalidation, as well as plain old memory
accesses.

The only reason we have DSB in our __switch_to code is to complete cache
maintenance in case the task is going to migrate to another CPU; there's
just no way to know that at the point we need to do the barrier :(
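
To make the distinction concrete, here is a rough userspace sketch, not the
kernel's actual code. All helper names (barrier_dmb_ish, barrier_dsb_ish,
publish, consume, switch_out_sketch) are made up for illustration. On
non-arm64 hosts both helpers fall back to a C11 full fence, which is only an
assumption to keep the example buildable; C11 has no equivalent of DSB.

```c
#include <stdatomic.h>

/* Hypothetical helpers for illustration; these are not the kernel's macros. */
#if defined(__aarch64__)
/* DMB ISH: orders memory accesses within the inner-shareable domain. */
static inline void barrier_dmb_ish(void)
{
	__asm__ volatile("dmb ish" ::: "memory");
}
/* DSB ISH: also waits for prior accesses and maintenance ops to complete. */
static inline void barrier_dsb_ish(void)
{
	__asm__ volatile("dsb ish" ::: "memory");
}
#else
/* Assumption: on other hosts both map to a full fence (no DSB analogue). */
static inline void barrier_dmb_ish(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}
static inline void barrier_dsb_ish(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}
#endif

static int data;
static atomic_int flag;

/* Ordering only: make 'data' visible before 'flag'. A DMB is sufficient. */
static void publish(int v)
{
	data = v;
	barrier_dmb_ish();
	atomic_store_explicit(&flag, 1, memory_order_relaxed);
}

/* Returns -1 if nothing has been published yet, otherwise the data. */
static int consume(void)
{
	if (!atomic_load_explicit(&flag, memory_order_relaxed))
		return -1;
	barrier_dmb_ish();
	return data;
}

/*
 * Stand-in for the switch path: the task might resume on another CPU, so
 * earlier cache maintenance has to have completed, not merely be ordered.
 * That is the case that needs the heavier DSB.
 */
static void switch_out_sketch(void)
{
	barrier_dsb_ish();
}
```

publish()/consume() only need ordering between cacheable accesses, so the
cheaper barrier does the job; switch_out_sketch() is the one place in this
sketch that pays for completion.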

Will