Re: [PATCH v4 7/7] selftests/resctrl: Adjust effective L3 cache size when SNC enabled

From: Tony Luck
Date: Fri Aug 25 2023 - 13:57:44 EST


On Fri, Aug 11, 2023 at 10:33:43AM -0700, Reinette Chatre wrote:
> Hi Tony,
>
> On 7/22/2023 12:07 PM, Tony Luck wrote:
> > Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA
> > nodes. Systems may support splitting into either two or four nodes.
> >
> > When SNC mode is enabled the effective amount of L3 cache available
> > for allocation is divided by the number of nodes per L3.
> >
> > Detect which SNC mode is active by comparing the number of CPUs
> > that share a cache with CPU0, with the number of CPUs on node0.
> >
> > Reported-by: "Shaopeng Tan (Fujitsu)" <tan.shaopeng@xxxxxxxxxxx>
> > Closes: https://lore.kernel.org/r/TYAPR01MB6330B9B17686EF426D2C3F308B25A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> This does not seem to be the case when looking at
> https://lore.kernel.org/all/TYAPR01MB6330A4EB3633B791939EA45E8B39A@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Correct. I'll drop the "Closes:" tag. I'm not sure what
the status is. Shaopeng didn't respond to my suggestion
to try "taskset(1)" when running the tests to check if
NUMA effects are causing the test to fail.

>
> > Signed-off-by: Tony Luck <tony.luck@xxxxxxxxx>
> > ---
> > tools/testing/selftests/resctrl/resctrl.h | 1 +
> > tools/testing/selftests/resctrl/resctrlfs.c | 57 +++++++++++++++++++++
> > 2 files changed, 58 insertions(+)
> >
> > diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> > index 87e39456dee0..a8b43210b573 100644
> > --- a/tools/testing/selftests/resctrl/resctrl.h
> > +++ b/tools/testing/selftests/resctrl/resctrl.h
> > @@ -13,6 +13,7 @@
> > #include <signal.h>
> > #include <dirent.h>
> > #include <stdbool.h>
> > +#include <ctype.h>
> > #include <sys/stat.h>
> > #include <sys/ioctl.h>
> > #include <sys/mount.h>
> > diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
> > index fb00245dee92..79eecbf9f863 100644
> > --- a/tools/testing/selftests/resctrl/resctrlfs.c
> > +++ b/tools/testing/selftests/resctrl/resctrlfs.c
> > @@ -130,6 +130,61 @@ int get_resource_id(int cpu_no, int *resource_id)
> > return 0;
> > }
> >
> > +/*
> > + * Count number of CPUs in a /sys bit map
> > + */
> > +static int count_sys_bitmap_bits(char *name)
> > +{
> > + FILE *fp = fopen(name, "r");
> > + int count = 0, c;
> > +
> > + if (!fp)
> > + return 0;
> > +
> > + while ((c = fgetc(fp)) != EOF) {
> > + if (!isxdigit(c))
> > + continue;
> > + switch (c) {
> > + case 'f':
> > + count++;
> > + case '7': case 'b': case 'd': case 'e':
> > + count++;
> > + case '3': case '5': case '6': case '9': case 'a': case 'c':
> > + count++;
> > + case '1': case '2': case '4': case '8':
> > + count++;
> > + }
> > + }
> > + fclose(fp);
> > +
> > + return count;
> > +}
> > +
> > +/*
> > + * Detect SNC by compating #CPUs in node0 with #CPUs sharing LLC with CPU0
> > + * Try to get this right, even if a few CPUs are offline so that the number
> > + * of CPUs in node0 is not exactly half or a quarter of the CPUs sharing the
> > + * LLC of CPU0.
> > + */
> > +static int snc_ways(void)
> > +{
> > + int node_cpus, cache_cpus;
> > +
> > + node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap");
> > + cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
> > +
> > + if (!node_cpus || !cache_cpus) {
> > + fprintf(stderr, "Warning could not determine Sub-NUMA Cluster mode\n");
> > + return 1;
> > + }
> > +
> > + if (4 * node_cpus >= cache_cpus)
> > + return 4;
> > + else if (2 * node_cpus >= cache_cpus)
> > + return 2;
> > + return 1;
> > +}
> > +
> > /*
> > * get_cache_size - Get cache size for a specified CPU
> > * @cpu_no: CPU number
> > @@ -190,6 +245,8 @@ int get_cache_size(int cpu_no, char *cache_type, unsigned long *cache_size)
> > break;
> > }
> >
> > + if (cache_num == 3)
> > + *cache_size /= snc_ways();
> > return 0;
> > }
> >
>
> I am surprised that this small change is sufficient. The resctrl
> selftests are definitely not NUMA aware and the CAT and CMT tests
> are not taking that into account when picking CPUs to run on. From
> what I understand LLC occupancy counters need to be added in this
> scenario but I do not see that done either.

This is a first step (the tests are definitely going to fail if
they have incorrect information about the cache size).

For a fully reliable set of tests some major surgery will be required
to bind to CPUs and memory to control allocation and access.

>
> Reinette

-Tony