Re: Patch "asm-generic/bitops/lock.h: Rewrite using atomic_fetch_" causes kernel crash

From: Peter Zijlstra
Date: Thu Aug 30 2018 - 16:45:25 EST


On Thu, Aug 30, 2018 at 08:31:59PM +0000, Vineet Gupta wrote:
> On 08/30/2018 07:29 AM, Peter Zijlstra wrote:
> > On Thu, Aug 30, 2018 at 03:23:55PM +0100, Will Deacon wrote:
> >
> >> Yes, that would be worth trying. However, I also just noticed that the
> >> fetch-ops (which are now used to implement test_and_set_bit_lock()) seem
> >> to be missing the backwards branch in the LL/SC case. Yet another diff
> >> below.
> >>
> >> Will
> >>
> >> --->8
> >>
> >> diff --git a/arch/arc/include/asm/atomic.h b/arch/arc/include/asm/atomic.h
> >> index 4e0072730241..f06c5ed672b3 100644
> >> --- a/arch/arc/include/asm/atomic.h
> >> +++ b/arch/arc/include/asm/atomic.h
> >> @@ -84,7 +84,7 @@ static inline int atomic_fetch_##op(int i, atomic_t *v) \
> >> "1: llock %[orig], [%[ctr]] \n" \
> >> " " #asm_op " %[val], %[orig], %[i] \n" \
> >> " scond %[val], [%[ctr]] \n" \
> >> - " \n" \
> >> + " bnz 1b \n" \
> >> : [val] "=&r" (val), \
> >> [orig] "=&r" (orig) \
> >> : [ctr] "r" (&v->counter), \
> > ACK!! sorry about that, no idea how I messed that up.
> >
> > Also, once it all works, they should look at switching to _relaxed
> > atomics for LL/SC.
>
> Indeed this is the mother of all issues, I tried and system is clearly hosed with
> and works after.
> What's amazing is the commit 4aef66c8ae9 which introduced it is from 2016 ;-)
> Back then we had a retry branch with backoff stuff which I'd reverted for new
> cores and the merge conflict somehow missed it.
>
> @PeterZ I'll create a patch with you as author ? do I need any formal sign offs,
> acks etc ?

Well, Will spotted it, give authorship to him, you have my ack per the
above.

> So after this there are 2 other things to be addresses / looked at still while we
> are still here.
>
> 1. After 84c6591103db __clear_bit_lock() implementation will be broken (or atleast
> not consistent with what we had after), do we need to reinstate it.
> 2. Will's proposed change to remove the underlying issue, but the issue in #1
> remains ?

No, like explained, for spinlock based atomics the issue _should_ not
exist, and if you look at your atomic_set() implementation for that
variant, you'll see it does the right thing by taking the lock.

Basically atomic_set() for spinlock based atomics ends up being
(void)atomic_xchg().

FWIW, also ACK on Will's patch to switch you over to asm-generic bitops
entirely.