Re: [PATCH] rcutorture: Avoid problematic critical section nesting on RT

From: Sebastian Andrzej Siewior
Date: Thu Aug 19 2021 - 11:47:14 EST


On 2021-08-19 17:39:29 [+0200], To Paul E. McKenney wrote:
> up with following which I can explain:
>
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index 40ef5417d9545..5c8b31b7eff03 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -1432,28 +1432,34 @@ static void rcutorture_one_extend(int *readstate, int newstate,
> /* First, put new protection in place to avoid critical-section gap. */
> if (statesnew & RCUTORTURE_RDR_BH)
> local_bh_disable();
> + if (statesnew & RCUTORTURE_RDR_RBH)
> + rcu_read_lock_bh();
> if (statesnew & RCUTORTURE_RDR_IRQ)
> local_irq_disable();
> if (statesnew & RCUTORTURE_RDR_PREEMPT)
> preempt_disable();
> - if (statesnew & RCUTORTURE_RDR_RBH)
> - rcu_read_lock_bh();
> if (statesnew & RCUTORTURE_RDR_SCHED)
> rcu_read_lock_sched();
> if (statesnew & RCUTORTURE_RDR_RCU)
> idxnew = cur_ops->readlock() << RCUTORTURE_RDR_SHIFT;

So the ordering in the enable and disable part regarding BH is
important. First BH, then preemption or IRQ.

> - /* Next, remove old protection, irq first due to bh conflict. */
> + /*
> + * Next, remove old protection, in decreasing order of strength
> + * to avoid unlock paths that aren't safe in the stronger
> + * context. Namely: BH can not be enabled with disabled interrupts.
> + * Additionally PREEMPT_RT requires that BH is enabled in preemptible
> + * context.
> + */
> if (statesold & RCUTORTURE_RDR_IRQ)
> local_irq_enable();
> - if (statesold & RCUTORTURE_RDR_BH)
> - local_bh_enable();
> if (statesold & RCUTORTURE_RDR_PREEMPT)
> preempt_enable();
> - if (statesold & RCUTORTURE_RDR_RBH)
> - rcu_read_unlock_bh();
> if (statesold & RCUTORTURE_RDR_SCHED)
> rcu_read_unlock_sched();
> + if (statesold & RCUTORTURE_RDR_BH)
> + local_bh_enable();
> + if (statesold & RCUTORTURE_RDR_RBH)
> + rcu_read_unlock_bh();
> if (statesold & RCUTORTURE_RDR_RCU) {
> bool lockit = !statesnew && !(torture_random(trsp) & 0xffff);

The same in the unlock part so that BH is unlocked in preemptible
context.
Now if you need bh lock/unlock in atomic context (either with disabled
IRQs or preemption) then I would dig out the atomic-bh part again and
make !RT only without the preempt_disable() section around about which
one you did complain.

> @@ -1496,6 +1502,9 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
> int mask = rcutorture_extend_mask_max();
> unsigned long randmask1 = torture_random(trsp) >> 8;
> unsigned long randmask2 = randmask1 >> 3;
> + unsigned long preempts = RCUTORTURE_RDR_PREEMPT | RCUTORTURE_RDR_SCHED;
> + unsigned long preempts_irq = preempts | RCUTORTURE_RDR_IRQ;
> + unsigned long bhs = RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
>
> WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT);
> /* Mostly only one bit (need preemption!), sometimes lots of bits. */
> @@ -1503,11 +1512,37 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
> mask = mask & randmask2;
> else
> mask = mask & (1 << (randmask2 % RCUTORTURE_RDR_NBITS));
> - /* Can't enable bh w/irq disabled. */
> - if ((mask & RCUTORTURE_RDR_IRQ) &&
> - ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) ||
> - (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH))))
> - mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
> +
> + /*
> + * Can't enable bh w/irq disabled.
> + */
> + if (mask & RCUTORTURE_RDR_IRQ)
> + mask |= oldmask & bhs;
> +
> + /*
> + * Ideally these sequences would be detected in debug builds
> + * (regardless of RT), but until then don't stop testing
> + * them on non-RT.
> + */
> + if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> + /*
> + * Can't release the outermost rcu lock in an irq disabled
> + * section without preemption also being disabled, if irqs
> + * had ever been enabled during this RCU critical section
> + * (could leak a special flag and delay reporting the qs).
> + */
> + if ((oldmask & RCUTORTURE_RDR_RCU) &&
> + (mask & RCUTORTURE_RDR_IRQ) &&
> + !(mask & preempts))
> + mask |= RCUTORTURE_RDR_RCU;

This piece above, I don't understand. I had it running for a while and
it didn't explode. Let me try TREE01 for 30min without that piece.

> + /* Can't modify bh in atomic context */
> + if (oldmask & preempts_irq)
> + mask &= ~bhs;
> + if ((oldmask | mask) & preempts_irq)
> + mask |= oldmask & bhs;

And this is needed because we can't lock/unlock bh while atomic.

> + }
> +
> return mask ?: RCUTORTURE_RDR_RCU;
> }
>

Sebastian