Re: [PATCH v7 23/24] x86/resctrl: Move domain helper migration into resctrl_offline_cpu()

From: Moger, Babu
Date: Thu Nov 09 2023 - 15:52:29 EST

On 10/25/23 13:03, James Morse wrote:
> When a CPU is taken offline the resctrl filesystem code needs to check
> if it was the CPU nominated to perform the periodic overflow and limbo
> work. If so, another CPU needs to be chosen to do this work.
>
> This is currently done in core.c, mixed in with the code that removes
> the CPU from the domain's mask, and potentially free()s the domain.
>
> Move the migration of the overflow and limbo helpers into the filesystem
> code, into resctrl_offline_cpu(). As resctrl_offline_cpu() runs before
> the architecture code has removed the CPU from the domain mask, the
> callers need to be told which CPU is being removed, to avoid picking
> it as the new CPU. This uses the exclude_cpu feature previously
> added.
>
> Tested-by: Shaopeng Tan <tan.shaopeng@xxxxxxxxxxx>
> Tested-by: Peter Newman <peternewman@xxxxxxxxxx>
> Reviewed-by: Shaopeng Tan <tan.shaopeng@xxxxxxxxxxx>
> Reviewed-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> Signed-off-by: James Morse <james.morse@xxxxxxx>

Reviewed-by: Babu Moger <babu.moger@xxxxxxx>
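
For anyone following the exclude_cpu reasoning in the commit message: because
resctrl_offline_cpu() runs while the outgoing CPU is still set in the domain
mask, the selection has to skip it explicitly. Below is a minimal userspace
sketch of that idea, not the kernel implementation. pick_cpu_excluding() and
PICK_ANY_CPU are hypothetical names used only for illustration, and a plain
64-bit word stands in for the domain's cpumask.

```c
#include <stdint.h>

/* Hypothetical stand-in for "exclude nothing" (illustration only). */
#define PICK_ANY_CPU (-1)

/*
 * Userspace model: return the lowest-numbered CPU set in 'mask',
 * skipping 'exclude_cpu'. Returns -1 if no other CPU is available.
 */
static int pick_cpu_excluding(uint64_t mask, int exclude_cpu)
{
	/* Drop the outgoing CPU from a local copy of the mask. */
	if (exclude_cpu >= 0 && exclude_cpu < 64)
		mask &= ~(UINT64_C(1) << exclude_cpu);

	for (int cpu = 0; cpu < 64; cpu++) {
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	}

	return -1;
}
```

With CPUs 2 and 3 in the mask (0xc) and CPU 2 going offline,
pick_cpu_excluding(0xc, 2) picks CPU 3, whereas passing PICK_ANY_CPU could
hand back CPU 2 itself. If CPU 2 is the only CPU in the mask, the model
returns -1, meaning there is no other CPU left to take over the work.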

> ---
> Changes since v5:
> * Changed fir tree order of variables.
> * Added mon-capable check for cpu offline.
>
> No changes since v6
> ---
> arch/x86/kernel/cpu/resctrl/core.c | 16 ----------------
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 18 ++++++++++++++++++
> 2 files changed, 18 insertions(+), 16 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 7d09b8d7c653..a90a07a5c876 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -582,22 +582,6 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
>
>  		return;
>  	}
> -
> -	if (r == &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) {
> -		if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
> -			cancel_delayed_work(&d->mbm_over);
> -			/*
> -			 * temporary: exclude_cpu=-1 as this CPU has already
> -			 * been removed by cpumask_clear_cpu()d
> -			 */
> -			mbm_setup_overflow_handler(d, 0, RESCTRL_PICK_ANY_CPU);
> -		}
> -		if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
> -		    has_busy_rmid(d)) {
> -			cancel_delayed_work(&d->cqm_limbo);
> -			cqm_setup_limbo_handler(d, 0, RESCTRL_PICK_ANY_CPU);
> -		}
> -	}
>  }
>
>  static void clear_closid_rmid(int cpu)
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 971a8397e243..d38b2fe6e3ca 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -4034,7 +4034,9 @@ static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
>
>  void resctrl_offline_cpu(unsigned int cpu)
>  {
> +	struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>  	struct rdtgroup *rdtgrp;
> +	struct rdt_domain *d;
>
>  	lockdep_assert_held(&rdtgroup_mutex);
>
> @@ -4044,6 +4046,22 @@ void resctrl_offline_cpu(unsigned int cpu)
>  			break;
>  		}
>  	}
> +
> +	if (!l3->mon_capable)
> +		return;
> +
> +	d = get_domain_from_cpu(cpu, l3);
> +	if (d) {
> +		if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
> +			cancel_delayed_work(&d->mbm_over);
> +			mbm_setup_overflow_handler(d, 0, cpu);
> +		}
> +		if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
> +		    has_busy_rmid(d)) {
> +			cancel_delayed_work(&d->cqm_limbo);
> +			cqm_setup_limbo_handler(d, 0, cpu);
> +		}
> +	}
>  }
>
>  /*

--
Thanks
Babu Moger