Re: [PATCH v7 21/24] x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but cpu

From: James Morse
Date: Fri Dec 15 2023 - 12:41:58 EST


Hi Reinette,

On 12/14/23 18:53, Reinette Chatre wrote:
On 12/14/2023 3:38 AM, James Morse wrote:
On 09/11/2023 17:48, Reinette Chatre wrote:
On 10/25/2023 11:03 AM, James Morse wrote:

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index c4c1e1909058..f5fff2f0d866 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -61,19 +61,36 @@
* cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
* aren't marked nohz_full
* @mask: The mask to pick a CPU from.
+ * @exclude_cpu:The CPU to avoid picking.
*
- * Returns a CPU in @mask. If there are housekeeping CPUs that don't use
- * nohz_full, these are preferred.
+ * Returns a CPU from @mask, but not @exclude_cpu. If there are housekeeping
+ * CPUs that don't use nohz_full, these are preferred. Pass
+ * RESCTRL_PICK_ANY_CPU to avoid excluding any CPUs.
+ *
+ * When a CPU is excluded, returns >= nr_cpu_ids if no CPUs are available.
*/
-static inline unsigned int cpumask_any_housekeeping(const struct cpumask *mask)
+static inline unsigned int
+cpumask_any_housekeeping(const struct cpumask *mask, int exclude_cpu)
{
unsigned int cpu, hk_cpu;
- cpu = cpumask_any(mask);
- if (!tick_nohz_full_cpu(cpu))
+ if (exclude_cpu == RESCTRL_PICK_ANY_CPU)
+ cpu = cpumask_any(mask);
+ else
+ cpu = cpumask_any_but(mask, exclude_cpu);
+
+ if (!IS_ENABLED(CONFIG_NO_HZ_FULL))
+ return cpu;

It is not clear to me how cpumask_any_but() failure is handled.

cpumask_any_but() returns ">= nr_cpu_ids if no cpus set" ...

It wasn't a satisfiable request, there are no CPUs for this domain other than the one that
was excluded. cpumask_any_but() also describes its errors as "returns >= nr_cpu_ids if no
CPUs are available".

The places this can happen in resctrl are:
cqm_setup_limbo_handler(), where it causes the schedule_delayed_work_on() call to be skipped.
mbm_setup_overflow_handler(), which does similar.

These two cases are triggered from resctrl_offline_cpu() when the last CPU in a domain is
going offline, and the domain is about to be free()d. This is how the limbo/overflow
'threads' stop.

Right ... yet this is a generic function, if there are any requirements on when/how it should
be called then it needs to be specified in the function comments. I do not expect this to
be the case for this function.

There are no special requirements, like all the other cpumask_foo() helpers, you can feed it
an empty bitmap and it will return '>= nr_cpu_ids' as an error.

[...]

But that would have
code continue ... so maybe it needs explicit error check of
cpumask_any_but() failure with an earlier exit?

I'm not sure what the problem you refer to here is.
If 'cpu' is valid, and not marked nohz_full, nothing more needs doing.
If 'cpu' is invalid or a CPU marked nohz_full, then a second attempt is made to find a
housekeeping CPU into 'hk_cpu'. If the second attempt is valid, it's used in preference.

Considering that the second attempt can only be on the same or smaller set of CPUs,
how could the second attempt ever succeed if the first attempt failed? I do not see
why it is worth continuing.

Its harmless, its not on a performance sensitive path and it would take extra logic in the
more common cases to detect this and return early.


Thanks,

James