Re: [PATCH] s390/idle: Fix suspicious RCU usage

From: Sven Schnelle
Date: Thu Oct 08 2020 - 04:58:27 EST


Hi Peter,

Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:

> On Wed, Oct 07, 2020 at 12:05:51PM +0200, Peter Zijlstra wrote:
>> On Wed, Oct 07, 2020 at 09:53:25AM +0200, Sven Schnelle wrote:
>> > Hi Peter,
>> >
>> > peterz@xxxxxxxxxxxxx writes:
>> >
>> > > After commit eb1f00237aca ("lockdep,trace: Expose tracepoints") the
>> > > lock tracepoints are visible to lockdep and RCU-lockdep is finding a
>> > > bunch more RCU violations that were previously hidden.
>> > >
>> > > Switch the idle->seqcount over to using raw_write_*() to avoid the
>> > > lockdep annotation and thus the lock tracepoints.
>> > >
>> > > Reported-by: Guenter Roeck <linux@xxxxxxxxxxxx>
>> > > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>> > > [..]
>> >
>> > I'm still seeing the splat below on s390 when irq tracing is enabled:
>>
>> Damn... :/
>>
>> This one is tricky, trouble seems to be that arch_cpu_idle() is defined
>> to enable interrupts (no doubt because of x86 :/), but we call it before
>> rcu_exit_idle().
>>
>> What a mess... let me rummage around the various archs to see what makes
>> most sense here.
>
> Maybe something like so, I've not yet tested it. I need to figure out
> how to force x86 into this path.

I gave this patch a quick test on linux-next from today and haven't
seen the splat again. However, it wasn't happening all the time, so I will
test it a bit longer. I haven't looked into the tracing code in detail,
but I guess it was only happening when the lock was contended.

The only issue with this patch is that RCU complains that it gets called
with interrupts enabled on s390 when rcu_irq_enter() is called. But a
few trace_hardirqs_{on,off}() calls at the beginning and end of the IRQ
handlers fix this. I will check why this worked in the past.

Sven