Re: [PATCH v10 1/8] x86/resctrl: Prepare for new domain scope

From: Reinette Chatre
Date: Mon Nov 06 2023 - 19:32:05 EST


Hi Tony,

On 10/31/2023 2:17 PM, Tony Luck wrote:
> Resctrl resources operate on subsets of CPUs in the system with the
> defining attribute of each subset being an instance of a particular
> level of cache. E.g. all CPUs sharing an L3 cache would be part of the
> same domain.
>
> In preparation for features that are scoped at the NUMA node level
> change the code from explicit references to "cache_level" to a more
> generic scope. At this point the only options for this scope are groups
> of CPUs that share an L2 cache or L3 cache.
>
> Clean up the error handling when looking up domains. Report invalid id's
> before calling rdt_find_domain() in preparation for better messages when
> scope can be other than cache scope. This means that rdt_find_domain()
> will never return an error. So remove checks for error from the callsites.
>
> Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
> ---
> Changes since v9
> New test for invalid domain id before calling rdt_find_domain() means that
> error handling in that function and at all call-sites can be simplified.

These changes do not appear to be applied consistently across this series.
The call-sites are indeed simplified in this patch, but patch 3 seems to undo
that work by reverting to the previous error handling in
domain_add_cpu_mon(), domain_remove_cpu_ctrl(), and domain_remove_cpu_mon().
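
To illustrate the pattern that I would expect every call-site to keep, here is
a minimal userspace sketch. It is not kernel code: struct domain,
find_domain() and remove_cpu_from_domain() are hypothetical stand-ins for
struct rdt_domain, rdt_find_domain() and the domain_remove_cpu*() helpers.
The point is only that once the id is validated up front, the lookup can
return nothing but a domain or NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in; not the kernel's struct rdt_domain. */
struct domain {
	int id;
	struct domain *next;
};

/*
 * Search a list kept sorted by id. The caller must have rejected
 * negative ids already, so the only outcomes are "found" or NULL.
 */
static struct domain *find_domain(struct domain *head, int id)
{
	struct domain *d;

	assert(id >= 0);

	for (d = head; d; d = d->next) {
		if (d->id == id)
			return d;
		/* Sorted list: no match is possible past this point. */
		if (d->id > id)
			break;
	}

	return NULL;
}

/* Returns 0 on success, -1 for an invalid id, -2 if no domain has that id. */
static int remove_cpu_from_domain(struct domain *head, int id)
{
	struct domain *d;

	if (id < 0)
		return -1;

	d = find_domain(head, id);
	if (!d)
		return -2;

	return 0;
}
```

With this shape a call-site needs a plain NULL check only, with no
IS_ERR()-style handling.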

> In pseudo_lock_region_init() use the new enum resctrl_scope for local variable.
>
> include/linux/resctrl.h | 9 +++--
> arch/x86/kernel/cpu/resctrl/core.c | 40 +++++++++++++++--------
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
> arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 6 +++-
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 5 ++-
> 5 files changed, 44 insertions(+), 18 deletions(-)
>

...

> @@ -506,17 +516,18 @@ static int arch_domain_mbm_alloc(u32 num_rmid, struct rdt_hw_domain *hw_dom)
> */
>  static void domain_add_cpu(int cpu, struct rdt_resource *r)
>  {
> -	int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
> +	int id = get_domain_id_from_scope(cpu, r->scope);
>  	struct list_head *add_pos = NULL;
>  	struct rdt_hw_domain *hw_dom;
>  	struct rdt_domain *d;
>  	int err;
>
> -	d = rdt_find_domain(r, id, &add_pos);
> -	if (IS_ERR(d)) {
> -		pr_warn("Couldn't find cache id for CPU %d\n", cpu);
> +	if (id < 0) {
> +		pr_warn_once("Can't find domain id for CPU:%d scope:%d for resource %s\n",
> +			     cpu, r->scope, r->name);
>  		return;
>  	}

Please add empty line here ...

> + d = rdt_find_domain(r, id, &add_pos);
>

... and remove this empty line.

> if (d) {
> cpumask_set_cpu(cpu, &d->cpu_mask);
> @@ -556,12 +567,15 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
>
>  static void domain_remove_cpu(int cpu, struct rdt_resource *r)
>  {
> -	int id = get_cpu_cacheinfo_id(cpu, r->cache_level);
> +	int id = get_domain_id_from_scope(cpu, r->scope);
>  	struct rdt_hw_domain *hw_dom;
>  	struct rdt_domain *d;
>
> +	if (id < 0)
> +		return;
> +
>  	d = rdt_find_domain(r, id, NULL);
> -	if (IS_ERR_OR_NULL(d)) {
> +	if (!d) {
>  		pr_warn("Couldn't find cache id for CPU %d\n", cpu);

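This error message is no longer accurate. The id lookup has already succeeded
at this point, so what failed is finding a domain with that id, and with a
generic scope the id is not necessarily a cache id.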
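
One possible wording, shown as a userspace sketch so that the format string
can be checked in isolation. The helper name format_missing_domain_msg() and
the exact message text are only suggestions, not something taken from the
series; in the kernel this would simply be the pr_warn() format string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build a warning that names the domain id that
 * could not be found instead of referring to a cache id.
 * Returns 1 if the whole message fit in buf, 0 otherwise.
 */
static int format_missing_domain_msg(char *buf, size_t len, int id, int cpu)
{
	int n;

	assert(buf != NULL);

	n = snprintf(buf, len, "Couldn't find domain with id=%d for CPU %d\n",
		     id, cpu);

	return n >= 0 && (size_t)n < len;
}
```

Printing the id as well as the CPU makes the message useful for any scope,
not just a cache scope.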

> return;
> }


Reinette