Re: [PATCH 00/30] PREEMPT_AUTO: support lazy rescheduling

From: Paul E. McKenney
Date: Thu Feb 15 2024 - 14:28:55 EST


On Wed, Feb 14, 2024 at 07:45:18PM -0800, Paul E. McKenney wrote:
> On Wed, Feb 14, 2024 at 06:03:28PM -0800, Ankur Arora wrote:
> >
> > Paul E. McKenney <paulmck@xxxxxxxxxx> writes:
> >
> > > On Mon, Feb 12, 2024 at 09:55:24PM -0800, Ankur Arora wrote:
> > >> Hi,
> > >>
> > >> This series adds a new scheduling model PREEMPT_AUTO, which like
> > >> PREEMPT_DYNAMIC allows dynamic switching between a none/voluntary/full
> > >> preemption model. However, unlike PREEMPT_DYNAMIC, it doesn't depend
> > >> on explicit preemption points for the voluntary models.
> > >>
> > >> The series is based on Thomas' original proposal which he outlined
> > >> in [1], [2] and in his PoC [3].
> > >>
> > >> An earlier RFC version is at [4].
> > >
> > > This uncovered a couple of latent bugs in RCU due to its having been
> > > a good long time since anyone built a !SMP preemptible kernel with
> > > non-preemptible RCU. I have a couple of fixes queued on -rcu [1], most
> > > likely for the merge window after next, but let me know if you need
> > > them sooner.
> >
> > Thanks. As you can probably tell, I skipped out on !SMP in my testing.
> > But, the attached diff should tide me over until the fixes are in.
>
> That was indeed my guess. ;-)
>
> > > I am also seeing OOM conditions during rcutorture testing of callback
> > > flooding, but I am still looking into this.
> >
> > That's on the PREEMPT_AUTO && PREEMPT_VOLUNTARY configuration?
>
> On two of the PREEMPT_AUTO && PREEMPT_NONE configurations, but only on
> two of them thus far. I am running a longer test to see if this might
> be just luck. If not, I look to see what rcutorture scenarios TREE10
> and TRACE01 have in common.

TRACE01 and TREE10 are still hitting OOMs, and I still do not see what
sets them apart. I also hit a grace-period hang in TREE04, which builds
with CONFIG_PREEMPT_VOLUNTARY=y along with CONFIG_PREEMPT_AUTO=y.
Something to dig into more.

I am also getting these from builds that enable KASAN:

vmlinux.o: warning: objtool: mwait_idle+0x13: call to tif_resched.constprop.0() leaves .noinstr.text section
vmlinux.o: warning: objtool: acpi_processor_ffh_cstate_enter+0x36: call to tif_resched.constprop.0() leaves .noinstr.text section
vmlinux.o: warning: objtool: cpu_idle_poll.isra.0+0x18: call to tif_resched.constprop.0() leaves .noinstr.text section
vmlinux.o: warning: objtool: acpi_safe_halt+0x0: call to tif_resched.constprop.0() leaves .noinstr.text section
vmlinux.o: warning: objtool: poll_idle+0x33: call to tif_resched.constprop.0() leaves .noinstr.text section
vmlinux.o: warning: objtool: default_enter_idle+0x18: call to tif_resched.constprop.0() leaves .noinstr.text section

Does tif_resched() need to be marked noinstr or some such?
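
The .constprop.0 suffix in those warnings says that the compiler emitted
an out-of-line clone of tif_resched() instead of inlining it, and objtool
complains because that clone lives outside .noinstr.text. One common way
to handle small helpers like this is to force inlining rather than to mark
them noinstr. Below is a minimal userspace sketch of that idea, not the
series' actual code: the sketch_* macros, the bit numbers, the shape of
tif_resched(), and idle_needs_resched() are all assumptions made up for
illustration, and it assumes GCC or Clang on an ELF target.

```c
/*
 * Userspace stand-ins for the kernel's __always_inline and noinstr.
 * The sketch_* names are hypothetical; the real definitions live in the
 * kernel's compiler headers and do more than this.
 */
#define sketch_always_inline inline __attribute__((__always_inline__))
#define sketch_noinstr \
	__attribute__((__noinline__, __section__(".noinstr.text")))

/* Placeholder bit numbers, chosen for illustration only. */
enum { TIF_NEED_RESCHED = 3, TIF_NEED_RESCHED_LAZY = 4 };

/* Assumed shape of the selector type; the series may define it differently. */
typedef enum { RESCHED_NOW = 0, RESCHED_LAZY = 1 } resched_t;

/*
 * Forced inlining means no out-of-line copy (and so no .constprop clone)
 * is emitted for callers to branch to.
 */
static sketch_always_inline int tif_resched(resched_t rs)
{
	return rs == RESCHED_LAZY ? TIF_NEED_RESCHED_LAZY : TIF_NEED_RESCHED;
}

/*
 * Hypothetical caller placed in .noinstr.text, standing in for the idle
 * functions named in the warnings. The helper's body is folded in here,
 * so no call leaves the section.
 */
sketch_noinstr int idle_needs_resched(unsigned long flags)
{
	return (flags & (1UL << tif_resched(RESCHED_NOW))) != 0;
}
```

Whether forced inlining or a noinstr annotation is the better fit for
tif_resched() depends on how the series uses it elsewhere, so treat the
above as a starting point only.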

Tracing got harder to disable, but I believe that is unrelated to lazy
preemption. ;-)

Thanx, Paul