Re: [PATCH 4/4] Revert "kernel/sched: Modify initial boot task idle setup"

From: Peter Zijlstra
Date: Fri Oct 20 2023 - 04:26:51 EST


On Fri, Oct 20, 2023 at 01:35:43AM +0200, Frederic Weisbecker wrote:
> Now that rcutiny can deal with early boot PF_IDLE setting, revert
> commit cff9b2332ab762b7e0586c793c431a8f2ea4db04.
>
> This fixes several subtle issues introduced on RCU-tasks(-trace):
>
> 1) RCU-tasks stalls when:
>
> 1.1 Grace period is started before init/0 had a chance to set PF_IDLE,
> keeping it stuck in the holdout list until idle ever schedules.
>
> 1.2 Grace period is started when some possible CPUs have never been
> online, keeping their idle tasks stuck in the holdout list until
> the CPU ever boots up.
>
> 1.3 Similar to 1.1 but with secondary CPUs: Grace period is started
> concurrently with secondary CPU booting, putting its idle task in
> the holdout list because PF_IDLE isn't yet observed on it. It
> then stays stuck in the holdout list until that CPU ever
> schedules. The effect is mitigated here by all the smpboot
> kthreads and the hotplug AP thread that must run to bring the
> CPU up.
>
> 2) Spurious warning on RCU task trace that assumes offline CPU's idle
> task is always PF_IDLE.
>
> More issues have been found in RCU-tasks related to PF_IDLE which should
> be fixed with later changes as those are not regressions:
>
> 3) The RCU-Tasks semantics consider the idle loop as a quiescent state,
> however:
>
> 3.1 The boot code preceding the idle entry is included in this
> quiescent state. Especially after the completion of kthreadd_done
> after which init/1 can launch userspace concurrently. The window
> is tiny before PF_IDLE is set but it exists.
>
> 3.2 Similarly, the boot code preceding the idle entry on secondary
> CPUs is wrongly accounted as RCU tasks quiescent state.
>

Urgh... so the plan is to fix RCU-tasks so that none of the above
relies on PF_IDLE? Because I rather like the stricter PF_IDLE and
consequently don't much like this revert.
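
For anyone following along, the stall in 1.1 and 1.2 can be illustrated
with a toy userspace model. This is only a sketch, not the kernel's
RCU-tasks code: the toy_* names and the struct are made up for
illustration, and the only assumptions taken from the description above
are that a task without PF_IDLE at grace-period start lands on the
holdout list and leaves it only once it schedules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PF_IDLE 0x2u            /* same name as the kernel flag; toy use only */

/* Hypothetical stand-in for a task; not struct task_struct. */
struct toy_task {
    unsigned int flags;         /* PF_IDLE set or not */
    bool holdout;               /* on the grace period's holdout list */
};

/* Start a grace period: every task not yet marked PF_IDLE becomes a
 * holdout. Returns the number of holdouts collected. */
size_t toy_gp_start(struct toy_task *tasks, size_t n)
{
    size_t holdouts = 0;

    for (size_t i = 0; i < n; i++) {
        tasks[i].holdout = !(tasks[i].flags & PF_IDLE);
        if (tasks[i].holdout)
            holdouts++;
    }
    return holdouts;
}

/* In this model a task leaves the holdout list only when it schedules. */
void toy_task_schedules(struct toy_task *t)
{
    assert(t != NULL);
    t->holdout = false;
}

/* The grace period may end only once no holdouts remain. */
bool toy_gp_done(const struct toy_task *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tasks[i].holdout)
            return false;
    return true;
}
```

In this model, an idle task that has not yet set PF_IDLE (1.1), or that
belongs to a CPU that never came online (1.2), is picked up by
toy_gp_start() and keeps toy_gp_done() returning false until
toy_task_schedules() is called for it, which for a never-booted CPU
never happens.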