Re: [PATCH] rcutorture: Avoid problematic critical section nesting on RT

From: Paul E. McKenney
Date: Thu Aug 19 2021 - 14:20:37 EST


On Thu, Aug 19, 2021 at 05:47:08PM +0200, Sebastian Andrzej Siewior wrote:
> On 2021-08-19 17:39:29 [+0200], To Paul E. McKenney wrote:
> > up with the following, which I can explain:
> >
> > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > index 40ef5417d9545..5c8b31b7eff03 100644
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -1432,28 +1432,34 @@ static void rcutorture_one_extend(int *readstate, int newstate,
> > /* First, put new protection in place to avoid critical-section gap. */
> > if (statesnew & RCUTORTURE_RDR_BH)
> > local_bh_disable();
> > + if (statesnew & RCUTORTURE_RDR_RBH)
> > + rcu_read_lock_bh();
> > if (statesnew & RCUTORTURE_RDR_IRQ)
> > local_irq_disable();
> > if (statesnew & RCUTORTURE_RDR_PREEMPT)
> > preempt_disable();
> > - if (statesnew & RCUTORTURE_RDR_RBH)
> > - rcu_read_lock_bh();
> > if (statesnew & RCUTORTURE_RDR_SCHED)
> > rcu_read_lock_sched();
> > if (statesnew & RCUTORTURE_RDR_RCU)
> > idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT;
>
> So the ordering of BH relative to preemption and IRQ is important in
> both the disable and the enable paths; when disabling: first BH, then
> preemption or IRQ.
>
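
To make that ordering concrete, here is a minimal sketch (illustration
only, not part of the patch) of the resulting nesting, on the assumption
that on PREEMPT_RT local_bh_disable() takes a per-CPU lock that can
sleep and therefore must be called from preemptible context:

	local_bh_disable();	/* BH first, while still preemptible */
	preempt_disable();	/* then preemption (or local_irq_disable()) */

	/* ... read-side work ... */

	preempt_enable();	/* leave the atomic section first ... */
	local_bh_enable();	/* ... then drop BH in preemptible context */
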
> > - /* Next, remove old protection, irq first due to bh conflict. */
> > + /*
> > + * Next, remove old protection, in decreasing order of strength
> > + * to avoid unlock paths that aren't safe in the stronger
> > + * context. Namely: BH cannot be enabled while interrupts are
> > + * disabled. Additionally, PREEMPT_RT requires that BH be enabled
> > + * in preemptible context.
> > + */
> > if (statesold & RCUTORTURE_RDR_IRQ)
> > local_irq_enable();
> > - if (statesold & RCUTORTURE_RDR_BH)
> > - local_bh_enable();
> > if (statesold & RCUTORTURE_RDR_PREEMPT)
> > preempt_enable();
> > - if (statesold & RCUTORTURE_RDR_RBH)
> > - rcu_read_unlock_bh();
> > if (statesold & RCUTORTURE_RDR_SCHED)
> > rcu_read_unlock_sched();
> > + if (statesold & RCUTORTURE_RDR_BH)
> > + local_bh_enable();
> > + if (statesold & RCUTORTURE_RDR_RBH)
> > + rcu_read_unlock_bh();
> > if (statesold & RCUTORTURE_RDR_RCU) {
> > bool lockit = !statesnew && !(torture_random(trsp) & 0xffff);
>
> The same applies in the unlock part, so that BH is unlocked in
> preemptible context.
> Now, if you need bh lock/unlock in atomic context (with either IRQs or
> preemption disabled), then I would dig out the atomic-bh part again and
> make it !RT only, without the preempt_disable() section around it that
> you complained about.
>
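
For reference, these are the shapes the reordering avoids (again an
illustration, not from the patch): per the comments quoted above, BH
cannot be re-enabled while IRQs are disabled, and on PREEMPT_RT it also
needs preemptible context:

	local_bh_disable();
	local_irq_disable();
	local_bh_enable();	/* bad: enabling BH with IRQs off */
	local_irq_enable();

	local_bh_disable();
	preempt_disable();
	local_bh_enable();	/* bad on PREEMPT_RT: not preemptible here */
	preempt_enable();
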
> > @@ -1496,6 +1502,9 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
> > int mask = rcutorture_extend_mask_max();
> > unsigned long randmask1 = torture_random(trsp) >> 8;
> > unsigned long randmask2 = randmask1 >> 3;
> > + unsigned long preempts = RCUTORTURE_RDR_PREEMPT | RCUTORTURE_RDR_SCHED;
> > + unsigned long preempts_irq = preempts | RCUTORTURE_RDR_IRQ;
> > + unsigned long bhs = RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
> >
> > WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT);
> > /* Mostly only one bit (need preemption!), sometimes lots of bits. */
> > @@ -1503,11 +1512,37 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
> > mask = mask & randmask2;
> > else
> > mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS));
> > - /* Can't enable bh w/irq disabled. */
> > - if ((mask & RCUTORTURE_RDR_IRQ) &&
> > - ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) ||
> > - (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH))))
> > - mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
> > +
> > + /*
> > + * Can't enable bh w/irq disabled.
> > + */
> > + if (mask & RCUTORTURE_RDR_IRQ)
> > + mask |= oldmask & bhs;
> > +
> > + /*
> > + * Ideally these sequences would be detected in debug builds
> > + * (regardless of RT), but until then don't stop testing
> > + * them on non-RT.
> > + */
> > + if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> > + /*
> > + * Can't release the outermost rcu lock in an irq disabled
> > + * section without preemption also being disabled, if irqs
> > + * had ever been enabled during this RCU critical section
> > + * (could leak a special flag and delay reporting the qs).
> > + */
> > + if ((oldmask & RCUTORTURE_RDR_RCU) &&
> > + (mask & RCUTORTURE_RDR_IRQ) &&
> > + !(mask & preempts))
> > + mask |= RCUTORTURE_RDR_RCU;
>
> I don't understand this piece above. I had it running for a while and
> it didn't explode. Let me try TREE01 for 30 minutes without that piece.

This might be historical. There was a time when interrupts being
disabled across rcu_read_unlock() meant that preemption had to have
been disabled across the entire RCU read-side critical section.

I am not seeing a purpose for it now, but I could easily be missing
something, especially given my tenuous grasp of RT.
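
The sequence in question would be roughly the following (a sketch only,
as I read the comment quoted above):

	rcu_read_lock();	/* preemption enabled, so the task can be
				 * preempted here and pick up deferred
				 * unlock-special state */
	local_irq_disable();
	rcu_read_unlock();	/* special handling deferred with IRQs off,
				 * potentially delaying the QS report */
	local_irq_enable();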

Either way, looking forward to the next version!

Thanx, Paul

> > + /* Can't modify bh in atomic context */
> > + if (oldmask & preempts_irq)
> > + mask &= ~bhs;
> > + if ((oldmask | mask) & preempts_irq)
> > + mask |= oldmask & bhs;
>
> And this is needed because we can't lock/unlock bh while atomic.
>
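
As a made-up worked example of those two statements: suppose the old
state is RCUTORTURE_RDR_PREEMPT | RCUTORTURE_RDR_BH (atomic, with BH
held) and the freshly chosen mask is RCUTORTURE_RDR_BH |
RCUTORTURE_RDR_RBH:

	mask &= ~bhs;		/* oldmask is atomic, so mask becomes 0:
				 * no new BH protection while atomic */
	mask |= oldmask & bhs;	/* mask becomes RCUTORTURE_RDR_BH: the BH
				 * protection already held is carried over
				 * rather than dropped while atomic */

So no new BH protection is taken while the old state is atomic, and BH
protection already held is kept whenever the old or the new state is
atomic.
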
> > + }
> > +
> > return mask ?: RCUTORTURE_RDR_RCU;
> > }
> >
>
> Sebastian