Re: [PATCH v1 04/20] x86/resctrl: Add domain offline callback for resctrl work

From: Jamie Iles
Date: Wed Aug 11 2021 - 12:10:46 EST


Hi James,

On Thu, Jul 29, 2021 at 10:35:54PM +0000, James Morse wrote:
> Because domains are exposed to user-space via resctrl, the filesystem
> must update its state when cpu hotplug callbacks are triggered.
>
> Some of this work is common to any architecture that would support
> resctrl, but the work is tied up with the architecture code to
> free the memory.
>
> Move the monitor subdir removal and the cancelling of the mbm/limbo
> works into a new resctrl_offline_domain() call.
>
> Signed-off-by: James Morse <james.morse@xxxxxxx>
> ---
>  arch/x86/kernel/cpu/resctrl/core.c     | 24 +---------------
>  arch/x86/kernel/cpu/resctrl/internal.h |  2 --
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 39 +++++++++++++++++++++++---
>  include/linux/resctrl.h                |  1 +
>  4 files changed, 37 insertions(+), 29 deletions(-)
...
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index e1af1d81b924..cf0db0b7a5d0 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
...
> @@ -3229,6 +3227,39 @@ static int __init rdtgroup_setup_root(void)
>  	return ret;
>  }
>
> +void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)
> +{
> +	lockdep_assert_held(&rdtgroup_mutex); // the arch code took this for us
> +
> +	if (!r->mon_capable)
> +		return;
> +
> +	/*
> +	 * If resctrl is mounted, remove all the
> +	 * per domain monitor data directories.
> +	 */
> +	if (static_branch_unlikely(&rdt_mon_enable_key))
> +		rmdir_mondata_subdir_allrdtgrp(r, d->id);
> +
> +	if (r->mon_capable && is_mbm_enabled())
> +		cancel_delayed_work(&d->mbm_over);

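The r->mon_capable check here is redundant: the function has already
returned early for !r->mon_capable, so this can just be
if (is_mbm_enabled()).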
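
To illustrate, here is a stand-alone sketch (not kernel code) comparing
the condition as posted with the simplified form. The *_sketch names,
the two-field structs and the counter are made up for illustration;
they only stand in for r->mon_capable, is_mbm_enabled() and
cancel_delayed_work(&d->mbm_over).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for struct rdt_resource / struct rdt_domain. */
struct resource_sketch { bool mon_capable; };
struct domain_sketch { int mbm_over_cancelled; };

/* Hypothetical stand-in for the result of is_mbm_enabled(). */
static bool mbm_enabled_sketch;

/* Stand-in for cancel_delayed_work(&d->mbm_over): just counts calls. */
static void cancel_mbm_over_sketch(struct domain_sketch *d)
{
	d->mbm_over_cancelled++;
}

/* Shape of the code as posted: mon_capable is tested twice. */
static void offline_as_posted(const struct resource_sketch *r,
			      struct domain_sketch *d)
{
	if (!r->mon_capable)
		return;

	if (r->mon_capable && mbm_enabled_sketch)
		cancel_mbm_over_sketch(d);
}

/* Simplified shape: the early return already covers mon_capable. */
static void offline_simplified(const struct resource_sketch *r,
			       struct domain_sketch *d)
{
	if (!r->mon_capable)
		return;

	if (mbm_enabled_sketch)
		cancel_mbm_over_sketch(d);
}

/* Run both shapes over all four input combinations and count differences. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (int mon = 0; mon <= 1; mon++) {
		for (int mbm = 0; mbm <= 1; mbm++) {
			struct resource_sketch r = { .mon_capable = mon };
			struct domain_sketch a = { 0 }, b = { 0 };

			mbm_enabled_sketch = mbm;
			offline_as_posted(&r, &a);
			offline_simplified(&r, &b);
			if (a.mbm_over_cancelled != b.mbm_over_cancelled)
				mismatches++;
		}
	}
	return mismatches;
}

/* Convenience wrapper: aborts if the two shapes ever disagree. */
static void self_check_sketch(void)
{
	assert(count_mismatches() == 0);
}
```

Both shapes behave identically for every combination of the two
inputs, so dropping the second test should not change behaviour.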

> +	if (is_llc_occupancy_enabled() && has_busy_rmid(r, d)) {
> +		/*
> +		 * When a package is going down, forcefully
> +		 * decrement rmid->ebusy. There is no way to know
> +		 * that the L3 was flushed and hence may lead to
> +		 * incorrect counts in rare scenarios, but leaving
> +		 * the RMID as busy creates RMID leaks if the
> +		 * package never comes back.
> +		 */
> +		__check_limbo(d, true);
> +		cancel_delayed_work(&d->cqm_limbo);
> +	}
> +	bitmap_free(d->rmid_busy_llc);
> +	kfree(d->mbm_total);
> +	kfree(d->mbm_local);
> +}

Thanks,

Jamie