[PATCH v9 0/8] Add support for Sub-NUMA cluster (SNC) systems

From: Tony Luck
Date: Fri Oct 20 2023 - 17:31:45 EST


The Sub-NUMA cluster feature on some Intel processors partitions the CPUs
that share an L3 cache into two or more sets. This plays havoc with the
Resource Director Technology (RDT) monitoring features. Prior to this
patch Intel has advised that SNC and RDT are incompatible.

Some of these CPUs support an MSR that can partition the RMID counters in
the same way. This allows the monitoring features to be used, with the
caveat that users must be aware that Linux may migrate tasks more
frequently between SNC nodes than between "regular" NUMA nodes, so reading
counters from all SNC nodes may be needed to get a complete picture of
activity for tasks.
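
As a rough userspace illustration of "reading counters from all SNC
nodes", the sketch below sums one monitoring event across every
monitoring domain of a resctrl group. It is not part of this series. It
assumes that each SNC node shows up as its own mon_data/mon_L3_* directory
under the group, and the helper name total_counter() is made up for this
example. Non-numeric readings such as "Unavailable" or "Error" are skipped.

```python
from pathlib import Path


def total_counter(group_dir, event="mbm_total_bytes"):
    """Sum one monitoring event over every mon_L3_* domain of a group.

    Assumption: with SNC enabled, each SNC node is exposed as a separate
    mon_data/mon_L3_* directory under the resctrl group directory.
    """
    total = 0
    for f in sorted(Path(group_dir).glob(f"mon_data/mon_L3_*/{event}")):
        text = f.read_text().strip()
        # resctrl may report "Unavailable" or "Error" instead of a number.
        if text.isdigit():
            total += int(text)
    return total
```

For example, total_counter("/sys/fs/resctrl/mygroup") would add up
mbm_total_bytes over all domains of a hypothetical group "mygroup".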

Cache and memory bandwidth allocation features continue to operate at
the scope of the L3 cache.

Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>

Changes since v6 (see individual patches for specifics):

v7 - had some git format-patch disaster and one of the patches couldn't
be applied.

v8 - Was rushed. Somehow I booted the wrong kernel while testing and
let escape a brown-paper-bag bug that crashed during boot.
Sincere apologies to all who wasted time reading this series,
or trying to boot it.

v9 - Tested (Really! I checked timestamps in dmesg, and all sorts of
other checks to make sure I was really looking at a kernel built
with these patches).

Rebased to tip/master October 20th since that has several other
resctrl changes staged ready for the next merge window. No
significant collisions, just noise where "git am" would not
automatically apply. New base is:

3300447612b2 ("Merge branch into tip/master: 'x86/tdx'")

Fixed the brown-paper-bag bug from v8.

Added Peter's "Reviewed-by" where offered (except on patch 3
which had the aforementioned bug).

Tony Luck (8):
x86/resctrl: Prepare for new domain scope
x86/resctrl: Prepare to split rdt_domain structure
x86/resctrl: Prepare for different scope for control/monitor
operations
x86/resctrl: Split the rdt_domain and rdt_hw_domain structures
x86/resctrl: Add node-scope to the options for feature scope
x86/resctrl: Introduce snc_nodes_per_l3_cache
x86/resctrl: Sub NUMA Cluster detection and enable
x86/resctrl: Update documentation with Sub-NUMA cluster changes

Documentation/arch/x86/resctrl.rst | 23 +-
include/linux/resctrl.h | 85 +++--
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/resctrl/internal.h | 66 ++--
arch/x86/kernel/cpu/resctrl/core.c | 402 +++++++++++++++++-----
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 58 ++--
arch/x86/kernel/cpu/resctrl/monitor.c | 58 ++--
arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 14 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 132 +++----
9 files changed, 592 insertions(+), 247 deletions(-)


base-commit: 3300447612b2adbc05cbb90e5d1cb288f19c40c6
--
2.41.0