Re: for_each_domain()/sched_domain_span() has offline CPUs (was Re: [PATCH 2/2] timers: Fix removed self-IPI on global timer's enqueue in nohz_full)

From: Frederic Weisbecker
Date: Thu Mar 28 2024 - 12:58:38 EST


On Thu, Mar 28, 2024 at 03:08:08PM +0100, Valentin Schneider wrote:
> On 27/03/24 15:28, Valentin Schneider wrote:
> > On 27/03/24 13:42, Frederic Weisbecker wrote:
> >> On Tue, Mar 26, 2024 at 05:46:07PM +0100, Valentin Schneider wrote:
> >>> > Then with that patch I ran TREE07, just some short iterations:
> >>> >
> >>> > tools/testing/selftests/rcutorture/bin/kvm.sh --configs "10*TREE07" --allcpus --bootargs "rcutorture.onoff_interval=200" --duration 2
> >>> >
> >>> > And the warning triggers very quickly. At least since v6.3 but maybe since
> >>> > earlier. Is this expected behaviour or am I right to assume that
> >>> > for_each_domain()/sched_domain_span() shouldn't return an offline CPU?
> >>> >
> >>>
> >>> I would very much assume an offline CPU shouldn't show up in a
> >>> sched_domain_span().
> >>>
> >>> Now, on top of the above, there's one more thing worth noting:
> >>> cpu_up_down_serialize_trainwrecks()
> >>>
> >>> This just flushes the cpuset work, so after that the sched_domain topology
> >>> should be sane. However I see it's invoked at the tail end of _cpu_down(),
> >>> IOW /after/ takedown_cpu() has run, which sounds too late. The comments
> >>> around this vs. lock ordering aren't very reassuring however, so I need to
> >>> look into this more.
> >>
> >> Ouch...
> >>
> >>>
> >>> Maybe as a "quick" test to see if this is the right culprit, you could try
> >>> that with CONFIG_CPUSETS=n? Because in that case the sched_domain update is
> >>> run within sched_cpu_deactivate().
> >>
> >> I just tried and I fear that doesn't help. It still triggers even without
> >> cpusets :-s
> >>
> >
> > What, you mean I can't always blame cgroups? What has the world come to?
> >
> > That's interesting, it means the deferred work item isn't the (only)
> > issue. I'll grab your test patch and try to reproduce on TREE07.
> >
>
> Unfortunately I haven't been able to trigger your warning with ~20 runs of
> TREE07 & CONFIG_CPUSETS=n, however it does trigger reliably with
> CONFIG_CPUSETS=y, so I'm back to thinking the cpuset work is a likely
> culprit...

Funny, I just checked again and I can still reliably reproduce with:

tools/testing/selftests/rcutorture/bin/kvm.sh --kconfig "CONFIG_CPUSETS=n CONFIG_PROC_PID_CPUSET=n" --configs "10*TREE07" --allcpus --bootargs "rcutorture.onoff_interval=200" --duration 2

I'm thinking there might be several culprits... ;-)