Re: [PATCH] sched: Prevent raising SCHED_SOFTIRQ when CPU is !active

From: Peter Zijlstra
Date: Tue Dec 15 2020 - 10:07:11 EST


On Tue, Dec 15, 2020 at 09:34:15AM -0500, Steven Rostedt wrote:
> On Tue, 15 Dec 2020 15:23:39 +0100 (CET)
> Anna-Maria Behnsen <anna-maria@xxxxxxxxxxxxx> wrote:
>
> > > > +	/*
> > > > +	 * Remove CPU from nohz.idle_cpus_mask to prevent participating in
> > > > +	 * load balancing when not active
> > > > +	 */
> > > > +	nohz_balance_exit_idle(rq);
> > > > +
> > > >  	set_cpu_active(cpu, false);
> > > >  	/*
> > > >  	 * We've cleared cpu_active_mask, wait for all preempt-disabled and RCU
> > >
> > > OK, so we must clear the state before !active, because getting an
> > > interrupt/softirq after would trigger the badness. And we're guaranteed
> > > nothing blocks between them to re-set it.
> >
> > As far as I understood, it is not a problem whether the delete is before or
> > after !active. When it is deleted after, the remote CPU will return in
> > kick_ilb() because cpu is not idle, because it is running the hotplug
> > thread.
>
> I was thinking that disabling it after may also cause some badness. Even if
> it does not, I think there's no harm in clearing it just before setting cpu
> active to false. And I find that the safer option.

The paranoid in me wanted to write it like:

	preempt_disable();
	nohz_balance_exit_idle(rq);
	set_cpu_active(cpu, false);
	preempt_enable();

(or possibly even with local_irq_disable()), to guarantee we don't go
idle between the two calls (which could re-set the nohz idle state we
just cleared).

But then I gave up :-)
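
For illustration only, the ordering concern can be played out in a small
userspace model. This is a rough sketch, not kernel code: struct cpu_model
and the model_*() helpers are hypothetical stand-ins, and the model simply
assumes that an idle entry always re-sets the nohz idle state unless
preemption is disabled. It says nothing about whether that window can
actually be hit in the real kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one CPU; not kernel code. */
struct cpu_model {
	int preempt_count;	/* > 0 means preemption is disabled */
	bool active;		/* stands in for the bit in cpu_active_mask */
	bool nohz_idle;		/* stands in for the bit in nohz.idle_cpus_mask */
};

/*
 * Model of an attempt to go idle. While preemption is disabled the
 * running task cannot be switched out for the idle task, so the
 * attempt is refused. Otherwise idle entry re-sets the nohz idle state
 * (an assumption of this model).
 */
static bool model_try_idle(struct cpu_model *c)
{
	if (c->preempt_count > 0)
		return false;
	c->nohz_idle = true;
	return true;
}

/*
 * Model of the deactivate step. 'paranoid' wraps the two calls in a
 * preempt-disabled section; 'idle_in_window' injects an idle attempt
 * between them. Returns true if a stale nohz idle bit is left behind.
 */
static bool model_deactivate(struct cpu_model *c, bool paranoid,
			     bool idle_in_window)
{
	if (paranoid)
		c->preempt_count++;	/* preempt_disable() */

	c->nohz_idle = false;		/* nohz_balance_exit_idle(rq) */
	if (idle_in_window)
		model_try_idle(c);
	c->active = false;		/* set_cpu_active(cpu, false) */

	if (paranoid)
		c->preempt_count--;	/* preempt_enable() */

	return c->nohz_idle;
}

/* Small usage example comparing the two orderings. */
static void model_demo(void)
{
	struct cpu_model plain = { 0, true, true };
	struct cpu_model guarded = { 0, true, true };

	/* Unprotected: the injected idle entry re-sets the bit. */
	assert(model_deactivate(&plain, false, true));
	/* Preempt-disabled: the idle attempt is refused, bit stays clear. */
	assert(!model_deactivate(&guarded, true, true));
}
```

In the model, only the variant with the preempt-disabled section is
immune to an idle entry landing between the two calls, which is all the
paranoid version is meant to buy.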