Re: [PATCH v3 08/19] x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow

From: Ilpo Järvinen
Date: Thu Apr 27 2023 - 10:33:37 EST


On Thu, 27 Apr 2023, James Morse wrote:

> Hi Ilpo,
>
> On 21/03/2023 15:14, Ilpo Järvinen wrote:
> > On Mon, 20 Mar 2023, James Morse wrote:
> >
> >> The limbo and overflow code picks a CPU to use from the domain's list
> >> of online CPUs. Work is then scheduled on these CPUs to maintain
> >> the limbo list and any counters that may overflow.
> >>
> >> cpumask_any() may pick a CPU that is marked nohz_full, which will
> >> either penalise the work that CPU was dedicated to, or delay the
> >> processing of limbo list or counters that may overflow. Perhaps
> >> indefinitely. Delaying the overflow handling will skew the bandwidth
> >> values calculated by mba_sc, which expects to be called once a second.
> >>
> >> Add cpumask_any_housekeeping() as a replacement for cpumask_any()
> >> that prefers housekeeping CPUs. This helper will still return
> >> a nohz_full CPU if that is the only option. The CPU to use is
> >> re-evaluated each time the limbo/overflow work runs. This ensures
> >> the work will move off a nohz_full CPU once a houskeeping CPU is
> >
> > housekeeping
> >
> >> available.
>
> >> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> >> index 87545e4beb70..0b5fd5a0cda2 100644
> >> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> >> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>
> >> +/**
> >> + * cpumask_any_housekeeping() - Chose any cpu in @mask, preferring those that
> >> + * aren't marked nohz_full
> >> + * @mask: The mask to pick a CPU from.
> >> + *
> >> + * Returns a CPU in @mask. If there are houskeeping CPUs that don't use
> >> + * nohz_full, these are preferred.
> >> + */
> >> +static inline unsigned int cpumask_any_housekeeping(const struct cpumask *mask)
> >> +{
> >> +	int cpu, hk_cpu;
> >> +
> >> +	cpu = cpumask_any(mask);
> >> +	if (tick_nohz_full_cpu(cpu)) {
> >> +		hk_cpu = cpumask_nth_andnot(0, mask, tick_nohz_full_mask);
> >
> > Why cpumask_nth_and() is not enough here? ..._andnot() seems to alter
> > tick_nohz_full_mask which doesn't seem desirable?
>
> tick_nohz_full_mask is the list of CPUs we should avoid. This wants to find the first cpu
> set in the domain mask, and clear in tick_nohz_full_mask.
>
> Where does cpumask_nth_andnot() modify its arguments? Its arguments are const.

Ah, it doesn't, I'm sorry about that.

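I think I was tripped up by the ambiguous English in the function's comment:
 * cpumask_nth_andnot - get the first cpu set in 1st cpumask, and clear in 2nd.
...which can be read as the function clearing that cpu in the 2nd mask,
rather than picking a cpu whose bit is already clear in the 2nd mask.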
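
For the archive, here is how I now read the semantics, as a rough userspace
sketch rather than the kernel code. Everything in it is an assumption made
for illustration: a plain 64-bit word stands in for struct cpumask, and the
names nth_andnot(), any_housekeeping() and NO_CPU are hypothetical. NO_CPU
plays the role of a "nothing found" return value.

```c
#include <stdint.h>

/* Hypothetical "not found" value for this 64-CPU model. */
#define NO_CPU 64u

/*
 * Model of the "set in 1st, clear in 2nd" lookup: return the n-th
 * (0-based) CPU whose bit is set in *mask1 and clear in *mask2.
 * Both masks are const, so neither argument is modified.
 */
static unsigned int nth_andnot(unsigned int n, const uint64_t *mask1,
			       const uint64_t *mask2)
{
	uint64_t candidates = *mask1 & ~*mask2;
	unsigned int cpu;

	for (cpu = 0; cpu < 64; cpu++) {
		if (!(candidates & (UINT64_C(1) << cpu)))
			continue;
		if (n-- == 0)
			return cpu;
	}
	return NO_CPU;
}

/*
 * Model of the helper's intent: prefer a CPU in *domain that is not
 * nohz_full, but fall back to any CPU in *domain if there is none.
 */
static unsigned int any_housekeeping(const uint64_t *domain,
				     const uint64_t *nohz_full)
{
	const uint64_t none = 0;
	unsigned int hk_cpu = nth_andnot(0, domain, nohz_full);

	if (hk_cpu < NO_CPU)
		return hk_cpu;
	/* Only nohz_full CPUs are available: take the first one. */
	return nth_andnot(0, domain, &none);
}
```

With a domain of CPUs 1-3 and CPUs 1-2 marked nohz_full, the sketch picks
CPU 3 and leaves both masks untouched. If every CPU in the domain is
nohz_full, it falls back to the first CPU in the domain.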


--
i.