Re: [PATCH v5 20/24] x86/resctrl: Add cpu online callback for resctrl work

From: Reinette Chatre
Date: Wed Aug 09 2023 - 18:38:23 EST


Hi James,

(please also check subject lines for CPU instead of cpu)

On 7/28/2023 9:42 AM, James Morse wrote:
> The resctrl architecture specific code may need to create a domain when
> a CPU comes online, it also needs to reset the CPUs PQR_ASSOC register.
> The resctrl filesystem code needs to update the rdtgroup_default CPU
> mask when CPUs are brought online.
>
> Currently this is all done in one function, resctrl_online_cpu().
> This will need to be split into architecture and filesystem parts
> before resctrl can be moved to /fs/.
>
> Pull the rdtgroup_default update work out as a filesystem specific
> cpu_online helper. resctrl_online_cpu() is the obvious name for this,
> which means the version in core.c needs renaming.
>
> resctrl_online_cpu() is called by the arch code once it has done the
> work to add the new CPU to any domains.
>
> In future patches, resctrl_online_cpu() will take the rdtgroup_mutex
> itself.
>
> Tested-by: Shaopeng Tan <tan.shaopeng@xxxxxxxxxxx>
> Signed-off-by: James Morse <james.morse@xxxxxxx>
> ---
> Changes since v3:
> * Renamed err to ret
>
> Changes since v4:
> * Changes in capitalisation.
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 11 ++++++-----
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 ++++++++++
> include/linux/resctrl.h | 1 +
> 3 files changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 8dfede01b0c9..a694563d3929 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -603,19 +603,20 @@ static void clear_closid_rmid(int cpu)
> wrmsr(MSR_IA32_PQR_ASSOC, 0, RESCTRL_RESERVED_CLOSID);
> }
>
> -static int resctrl_online_cpu(unsigned int cpu)
> +static int resctrl_arch_online_cpu(unsigned int cpu)
> {
> struct rdt_resource *r;
> + int ret;
>
> mutex_lock(&rdtgroup_mutex);
> for_each_capable_rdt_resource(r)
> domain_add_cpu(cpu, r);
> - /* The cpu is set in default rdtgroup after online. */
> - cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask);
> clear_closid_rmid(cpu);
> +
> + ret = resctrl_online_cpu(cpu);
> mutex_unlock(&rdtgroup_mutex);
>
> - return 0;
> + return ret;
> }

This is unexpected that resctrl_online_cpu() returns an error ... and
then the caller exits with failure without error handling or unwinding
the previous work. Is this error return needed? The function just
always returns zero so it looks like it could just be void.

>
> static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
> @@ -965,7 +966,7 @@ static int __init resctrl_late_init(void)
>
> state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
> "x86/resctrl/cat:online:",
> - resctrl_online_cpu, resctrl_offline_cpu);
> + resctrl_arch_online_cpu, resctrl_offline_cpu);
> if (state < 0)
> return state;
>
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index fef78a3dc632..7bd3a3dc0f44 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -3868,6 +3868,16 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
> return 0;
> }
>
> +int resctrl_online_cpu(unsigned int cpu)
> +{
> + lockdep_assert_held(&rdtgroup_mutex);
> +
> + /* The CPU is set in default rdtgroup after online. */
> + cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask);
> +
> + return 0;
> +}
> +
> /*
> * rdtgroup_init - rdtgroup initialization
> *
> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 5e4b4df9610b..35d3c97df212 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -223,6 +223,7 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
> u32 closid, enum resctrl_conf_type type);
> int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d);
> void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d);
> +int resctrl_online_cpu(unsigned int cpu);
>
> /**
> * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rmid


Reinette