Re: NOHZ interaction between IPI-less kick_ilb() and nohz_csd_func().

From: Joel Fernandes
Date: Wed Oct 04 2023 - 12:17:10 EST


On Wed, Oct 4, 2023 at 12:09 PM Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
>
> +Frederic Weisbecker
>
> On Wed, Sep 13, 2023 at 10:32 AM Suleiman Souhlal <suleiman@xxxxxxxxxx> wrote:
> >
> > (I forgot to also add Vincent...)
> >
> > On Wed, Sep 13, 2023 at 9:49 PM Suleiman Souhlal <suleiman@xxxxxxxxxx> wrote:
> > >
> > > Hello,
> > >
> > > I noticed that on x86 machines that have MWAIT, with NOHZ, when the
> > > kernel decides to kick the idle load balance on another CPU in
> > > kick_ilb(), there's an optimization that makes it avoid using an IPI
> > > and instead exploit the fact that the remote CPU is MWAITing on the
> > > thread_info flags, by just setting TIF_NEED_RESCHED, in
> > > call_function_single_prep_ipi().
> > > However, on the remote CPU, in nohz_csd_func(), we end up not raising
> > > the sched softirq due to NEED_RESCHED being set, so the ILB doesn't
> > > end up getting done.
> > >
> > > Is this intended?

Just thinking out loud: I wonder how much nohz ILB really matters if,
based on what Suleiman is saying, it is not even triggering on x86
due to the MWAIT optimization. And if it does matter, how much
improvement would fixing this bug give? At least on ARM, I remember
it mattering.
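
To make the sequence Suleiman describes concrete, here is a toy userspace
model of it. This is only a sketch, not kernel code: the struct and every
toy_* name are made up for illustration, and the single need_resched field
just stands in for TIF_NEED_RESCHED. It assumes the handler checks "idle and
no resched pending" before raising the softirq, as the report implies.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one idle CPU; every name here is hypothetical. */
struct toy_cpu {
	bool idle;           /* CPU is sitting in the idle loop */
	bool need_resched;   /* stands in for TIF_NEED_RESCHED */
	bool kick_pending;   /* stands in for the NOHZ kick flags */
	bool softirq_raised; /* stands in for a pending SCHED_SOFTIRQ */
};

/* Kick path: a polling (MWAIT) idle CPU gets a flag write instead of an IPI. */
static void toy_kick_ilb(struct toy_cpu *cpu, bool polling_idle)
{
	cpu->kick_pending = true;
	if (polling_idle)
		cpu->need_resched = true; /* wakes the MWAITing CPU, no IPI */
	/* otherwise a real IPI would be sent and need_resched stays clear */
}

/* Handler on the kicked CPU: raise the softirq only if no resched is pending. */
static void toy_nohz_csd_func(struct toy_cpu *cpu)
{
	assert(cpu->kick_pending); /* the handler only runs after a kick */
	cpu->kick_pending = false;
	if (cpu->idle && !cpu->need_resched)
		cpu->softirq_raised = true;
}

/* Returns true if the idle load balance would end up being scheduled. */
static bool toy_ilb_runs(bool polling_idle)
{
	struct toy_cpu cpu = { .idle = true };

	toy_kick_ilb(&cpu, polling_idle);
	toy_nohz_csd_func(&cpu);
	return cpu.softirq_raised;
}
```

In this model toy_ilb_runs(false) (the IPI path) returns true, while
toy_ilb_runs(true) (the IPI-less path) returns false, because the flag the
kick itself sets is what makes the handler skip the softirq.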

I am meanwhile looking at it more closely...

thanks,

- Joel