Re: [PATCH v4 02/24] x86/resctrl: Access per-rmid structures by index

From: Reinette Chatre
Date: Thu Jun 15 2023 - 18:03:46 EST


Hi James,

On 5/25/2023 11:01 AM, James Morse wrote:
> Because of the differences between Intel RDT/AMD QoS and Arm's MPAM
> monitors, RMID values on arm64 are not unique unless the CLOSID is

I find the above a bit confusing ... the theme seems to be "RMID values
on arm64 are not unique because they are different from Intel".
Compare to: "One of the differences between Intel RDT/AMD QoS and
Arm's MPAM monitors is that RMID values on arm64 are not unique unless
the CLOSID is also included."

> also included. Bitmaps like rmid_busy_llc need to be sized by the
> number of unique entries for this resource.
>
> Add helpers to encode/decode the CLOSID and RMID to an index. The
> domain's rmid_busy_llc and rmid_ptrs[] are then sized by index,
> as are the domain mbm_local and mbm_total arrays.

You can use "[]" to indicate an array, for example mbm_local[] and
mbm_total[], as is already done for rmid_ptrs[].

> On x86, the index is always just the RMID, so all these structures
> remain the same size.

I do not consider this accurate: the previous patch increased the
size of each element to support this change.

> The index gives resctrl a unique value it can use to store monitor
> values, and allows MPAM to decode the CLOSID when reading the hardware
> counters.
>
> Tested-by: Shaopeng Tan <tan.shaopeng@xxxxxxxxxxx>
> Signed-off-by: James Morse <james.morse@xxxxxxx>
> ---
> Changes since v1:
> * Added X86_BAD_CLOSID macro to make it clear what this value means
> * Added second WARN_ON() for closid checking, and made both _ONCE()
>
> Changes since v2:
> * Added RESCTRL_RESERVED_CLOSID
> * Removed a newline
> * Repharsed some comments
> * Renamed a variable 'ignore'd
> * Moved X86_RESCTRL_BAD_CLOSID to a previous patch
>
> Changes since v3:
> * Changed a variable name
> * Fixed various typos
> ---
> arch/x86/include/asm/resctrl.h | 17 ++++++
> arch/x86/kernel/cpu/resctrl/core.c | 2 +-
> arch/x86/kernel/cpu/resctrl/internal.h | 1 +
> arch/x86/kernel/cpu/resctrl/monitor.c | 84 +++++++++++++++++---------
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 7 ++-
> include/linux/resctrl.h | 3 +
> 6 files changed, 83 insertions(+), 31 deletions(-)
>

...

> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 86574adedd64..bcc25f5339c0 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -142,12 +142,29 @@ static inline u64 get_corrected_mbm_count(u32 rmid, unsigned long val)
> return val;
> }
>
> -static inline struct rmid_entry *__rmid_entry(u32 closid, u32 rmid)
> +/*
> + * x86 and arm64 differ in their handling of monitoring.
> + * x86's RMID are an independent number, there is only one source of traffic
> + * with an RMID value of '1'.
> + * arm64's PMG extend the PARTID/CLOSID space, there are multiple sources of
> + * traffic with a PMG value of '1', one for each CLOSID, meaning the RMID
> + * value is no longer unique.
> + * To account for this, resctrl uses an index. On x86 this is just the RMID,
> + * on arm64 it encodes the CLOSID and RMID. This gives a unique number.
> + *
> + * The domain's rmid_busy_llc and rmid_ptrs are sized by index. The arch code

rmid_ptrs[]

> + * must accept an attempt to read every index.
> + */
> +static inline struct rmid_entry *__rmid_entry(u32 idx)
> {
> struct rmid_entry *entry;
> + u32 closid, rmid;
>
> - entry = &rmid_ptrs[rmid];
> - WARN_ON(entry->rmid != rmid);
> + entry = &rmid_ptrs[idx];
> + resctrl_arch_rmid_idx_decode(idx, &closid, &rmid);
> +
> + WARN_ON_ONCE(entry->closid != closid);
> + WARN_ON_ONCE(entry->rmid != rmid);
>
> return entry;
> }

...

> @@ -377,14 +399,16 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
>
> void free_rmid(u32 closid, u32 rmid)
> {
> + u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid);
> struct rmid_entry *entry;
>
> - if (!rmid)
> - return;
> -
> lockdep_assert_held(&rdtgroup_mutex);
>
> - entry = __rmid_entry(closid, rmid);
> + /* do not allow the default rmid to be free'd */
> + if (!idx)
> + return;
> +

The interface seems to become blurry here. There are new
architecture-specific encode/decode callbacks while at the same
time there are a few requirements:
- closid 0 and rmid 0 are reserved
- closid 0 and rmid 0 must map to index 0 (the architecture
callbacks thus do not have much freedom here ... they must
return index 0 for closid 0 and rmid 0, no?).

It does seem a bit strange that one layer provides the values (0,0)
to the other layer while requiring a specific answer (0).
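
To illustrate the constraint I mean, here is a minimal userspace sketch
(not taken from the patch). The x86 helper follows the changelog ("the
index is always just the RMID"). The MPAM-style helpers and
MODEL_PMG_BITS are purely hypothetical, just one way an architecture
might pack the CLOSID and RMID. Both encodings happen to return 0 for
(0,0), but nothing in the interface itself states that they must:

```c
#include <stdint.h>

/* Hypothetical width of the PMG field, for illustration only. */
#define MODEL_PMG_BITS	2

/* x86 model: the index is just the RMID, the CLOSID is ignored. */
static inline uint32_t x86_idx_encode(uint32_t closid, uint32_t rmid)
{
	(void)closid;
	return rmid;
}

/* Hypothetical MPAM-style model: pack the CLOSID above the PMG bits. */
static inline uint32_t mpam_idx_encode(uint32_t closid, uint32_t rmid)
{
	return (closid << MODEL_PMG_BITS) | rmid;
}

/* Inverse of mpam_idx_encode(): split the index back into its parts. */
static inline void mpam_idx_decode(uint32_t idx, uint32_t *closid,
				   uint32_t *rmid)
{
	*closid = idx >> MODEL_PMG_BITS;
	*rmid = idx & ((1u << MODEL_PMG_BITS) - 1);
}
```

An encoding that reserved index 0 for something else would be a valid
implementation of the callbacks as described, yet it would break the
"if (!idx)" check in free_rmid().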

What if RESCTRL_RESERVED_RMID is also introduced and, before encoding
the CLOSID and RMID, the core first checks whether it is a reserved
entry being freed and exits early if this is the case?
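
Something like the below is what I have in mind. It is only an untested
userspace sketch: model_free_rmid() and model_rmid_idx_encode() are
hypothetical stand-ins for free_rmid() and
resctrl_arch_rmid_idx_encode(), the x86 behaviour of the encode helper
is assumed, and the locking and limbo handling are left out:

```c
#include <stdbool.h>
#include <stdint.h>

#define RESCTRL_RESERVED_CLOSID	0
#define RESCTRL_RESERVED_RMID	0	/* proposed, not part of this patch */

/* Stand-in for resctrl_arch_rmid_idx_encode(); x86 behaviour assumed. */
static inline uint32_t model_rmid_idx_encode(uint32_t closid, uint32_t rmid)
{
	(void)closid;
	return rmid;
}

/*
 * Model of free_rmid(): returns true and stores the index in *idx if
 * the entry would be freed, returns false and leaves *idx untouched
 * if the entry is reserved.
 */
static inline bool model_free_rmid(uint32_t closid, uint32_t rmid,
				   uint32_t *idx)
{
	/* Do not allow the default group's reserved entry to be freed. */
	if (closid == RESCTRL_RESERVED_CLOSID &&
	    rmid == RESCTRL_RESERVED_RMID)
		return false;

	/* Only now ask the architecture for the index. */
	*idx = model_rmid_idx_encode(closid, rmid);
	return true;
}
```

With this ordering the core no longer depends on which index the
architecture picks for the reserved pair. Whether the check should
compare both values or the RMID alone is open for discussion.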


> + entry = __rmid_entry(idx);
>
> if (is_llc_occupancy_enabled())
> add_rmid_to_limbo(entry);

...

> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> index 7d80bae05f59..ff7452f644e4 100644
> --- a/include/linux/resctrl.h
> +++ b/include/linux/resctrl.h
> @@ -6,6 +6,9 @@
> #include <linux/list.h>
> #include <linux/pid.h>
>
> +/* CLOSID value used by the default control group */
> +#define RESCTRL_RESERVED_CLOSID 0
> +

#define RESCTRL_RESERVED_RMID 0 ?

> #ifdef CONFIG_PROC_CPU_RESCTRL
>
> int proc_resctrl_show(struct seq_file *m,


Reinette