Re: [PATCH v4 6/7] x86/resctrl: Update documentation with Sub-NUMA cluster changes

From: Tony Luck
Date: Fri Aug 25 2023 - 13:51:11 EST


On Fri, Aug 11, 2023 at 10:33:18AM -0700, Reinette Chatre wrote:
> Hi Tony,
>
> On 7/22/2023 12:07 PM, Tony Luck wrote:
> > With Sub-NUMA Cluster mode enabled the scope of monitoring resources is
> > per-NODE instead of per-L3 cache. Suffixes of directories with "L3" in
> > their name refer to Sub-NUMA nodes instead of L3 cache ids.
> >
> > Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
> > Reviewed-by: Peter Newman <peternewman@xxxxxxxxxx>
> > ---
> > Documentation/arch/x86/resctrl.rst | 10 +++++++---
> > 1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/Documentation/arch/x86/resctrl.rst b/Documentation/arch/x86/resctrl.rst
> > index cb05d90111b4..4d9ddb91751d 100644
> > --- a/Documentation/arch/x86/resctrl.rst
> > +++ b/Documentation/arch/x86/resctrl.rst
> > @@ -345,9 +345,13 @@ When control is enabled all CTRL_MON groups will also contain:
> > When monitoring is enabled all MON groups will also contain:
> >
> > "mon_data":
> > - This contains a set of files organized by L3 domain and by
> > - RDT event. E.g. on a system with two L3 domains there will
> > - be subdirectories "mon_L3_00" and "mon_L3_01". Each of these
> > + This contains a set of files organized by L3 domain or by NUMA
> > + node (depending on whether Sub-NUMA Cluster (SNC) mode is disabled
> > + or enabled respectively) and by RDT event. E.g. on a system with
> > + SNC mode disabled with two L3 domains there will be subdirectories
> > + "mon_L3_00" and "mon_L3_01". The numerical suffix refers to the
> > + L3 cache id. With SNC enabled the directory names are the same,
> > + but the numerical suffix refers to the node id. Each of these
> > directories have one file per event (e.g. "llc_occupancy",
> > "mbm_total_bytes", and "mbm_local_bytes"). In a MON group these
> > files provide a read out of the current value of the event for
>
> I think it would be helpful to add a modified version of the snippet
> (from previous patch changelog) regarding well-behaved NUMA apps.
> With the above it may be confusing that a single cache allocation has
> multiple cache occupancy counters.
>
> This also changes the meaning of the numbers in the directory names.
> The documentation already provides guidance on how to find the cache
> ID of a logical CPU (see section "Cache IDs"). I think it will be
> helpful to add a snippet that makes it clear to users how to map
> a CPU to its node ID.

Added both details as suggested: a note on well-behaved NUMA apps, and a
snippet showing how to map a CPU to its node ID.
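
For anyone following along, here is a rough sketch in Python of the lookup
being discussed. It is not the text added to resctrl.rst. It assumes the
usual sysfs layout, where each /sys/devices/system/cpu/cpuN directory
holds a "nodeM" entry naming the CPU's NUMA node, and it assumes that
cache/index3 is the L3 cache (the "level" file next to "id" is the way to
confirm that). The function names and the snc_enabled argument are made
up for illustration; the caller has to know whether SNC mode is on.

```python
import os
import re


def cpu_to_node(cpu, sysfs_root="/sys"):
    """Return the NUMA node id of logical CPU `cpu`.

    Relies on the "node<M>" entry inside the CPU's sysfs directory.
    """
    cpu_dir = os.path.join(sysfs_root, "devices", "system", "cpu", f"cpu{cpu}")
    for entry in os.listdir(cpu_dir):
        match = re.fullmatch(r"node(\d+)", entry)
        if match:
            return int(match.group(1))
    raise LookupError(f"no node entry found for cpu{cpu}")


def cpu_to_l3_cache_id(cpu, sysfs_root="/sys"):
    """Return the L3 cache id of logical CPU `cpu`.

    Assumption: index3 is the L3 cache on this system.
    """
    id_path = os.path.join(sysfs_root, "devices", "system", "cpu",
                           f"cpu{cpu}", "cache", "index3", "id")
    with open(id_path) as f:
        return int(f.read().strip())


def mon_dir_name(cpu, snc_enabled, sysfs_root="/sys"):
    """Name of the mon_data subdirectory covering `cpu`.

    With SNC enabled the suffix is the node id, otherwise the L3 cache id.
    `snc_enabled` is a hypothetical flag supplied by the caller.
    """
    if snc_enabled:
        domain = cpu_to_node(cpu, sysfs_root)
    else:
        domain = cpu_to_l3_cache_id(cpu, sysfs_root)
    return f"mon_L3_{domain:02d}"
```

Calling mon_dir_name(5, snc_enabled=True) on a live system would name the
directory under "mon_data" whose counters cover CPU 5. The sysfs_root
argument is only there so the functions can be pointed at a test tree.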

>
> Reinette

-Tony