Re: [PATCH v2 24/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

From: Ilpo Järvinen
Date: Wed Apr 26 2023 - 09:59:02 EST


On Fri, 21 Apr 2023, Reinette Chatre wrote:

> Hi Ilpo,
>
> On 4/18/2023 4:45 AM, Ilpo Järvinen wrote:
> > CAT test spawns two processes into two different control groups with
> > exclusive schemata. Both the processes alloc a buffer from memory
> > matching their allocated LLC block size and flush the entire buffer out
> > of caches. Since the processes are reading through the buffer only once
> > during the measurement and initially all the buffer was flushed, the
> > test isn't testing CAT.
> >
> > Rewrite the CAT test to allocated a buffer sized to half of LLC. Then
>
> "allocated a buffer" -> "allocate a buffer" ?
>
> > perform a sequence of tests with different LLC alloc sizes starting
> > from half of the CBM bits down to 1-bit CBM. Flush the buffer before
> > each test and read the buffer twice. Observe the LLC misses on the
> > second read through the buffer. As the allocated LLC block gets smaller
> > and smaller, the LLC misses will become larger and larger giving a
> > strong signal on CAT working properly.
>
> Since the changelog starts by describing the CAT test needing two
> processes I think it would help to highlight that this test uses a
> single process. I think it would also help to describing how the cache
> is used by the rest while this test is running.

Sure, good points, I'll add the info.

> > Suggested-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> > ---
> > tools/testing/selftests/resctrl/cache.c | 20 +-
> > tools/testing/selftests/resctrl/cat_test.c | 204 +++++++++------------
> > 2 files changed, 97 insertions(+), 127 deletions(-)
> >
> > diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
> > index 7970239413da..64f08ba5edc2 100644
> > --- a/tools/testing/selftests/resctrl/cache.c
> > +++ b/tools/testing/selftests/resctrl/cache.c
> > @@ -224,10 +224,10 @@ int measure_llc_resctrl(struct resctrl_val_param *param, int bm_pid)
> > */
> > int cat_val(struct resctrl_val_param *param)
> > {
> > - int memflush = 1, operation = 0, ret = 0;
> > char *resctrl_val = param->resctrl_val;
> > unsigned long llc_perf_miss = 0;
> > pid_t bm_pid;
> > + int ret;
> >
> > if (strcmp(param->filename, "") == 0)
> > sprintf(param->filename, "stdio");
> > @@ -245,6 +245,10 @@ int cat_val(struct resctrl_val_param *param)
> > if (ret)
> > return ret;
> >
> > + ret = alloc_buffer(param->span, 1);
> > + if (ret)
> > + return ret;
> > +
> > initialize_llc_perf();
> >
> > /* Test runs until the callback setup() tells the test to stop. */
> > @@ -256,17 +260,15 @@ int cat_val(struct resctrl_val_param *param)
> > }
> > if (ret < 0)
> > break;
> > +
> > + flush_buffer(param->span);
> > + use_buffer(param->span, 0, true);
> > +
> > ret = reset_enable_llc_perf(bm_pid, param->cpu_no);
> > if (ret)
> > break;
> >
> > - if (run_fill_buf(param->span, memflush, operation, true)) {
> > - fprintf(stderr, "Error-running fill buffer\n");
> > - ret = -1;
> > - break;
> > - }
> > -
> > - sleep(1);
> > + use_buffer(param->span, 0, true);
> >
> > /* Measure cache miss from perf */
> > ret = get_llc_perf(&llc_perf_miss);
> > @@ -279,6 +281,8 @@ int cat_val(struct resctrl_val_param *param)
> > break;
> > }
> >
> > + free_buffer();
> > +
> > return ret;
> > }
> >
> > diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> > index 4b505fdb35d7..85053829b9c5 100644
> > --- a/tools/testing/selftests/resctrl/cat_test.c
> > +++ b/tools/testing/selftests/resctrl/cat_test.c
> > @@ -11,11 +11,12 @@
> > #include "resctrl.h"
> > #include <unistd.h>
> >
> > -#define RESULT_FILE_NAME1 "result_cat1"
> > -#define RESULT_FILE_NAME2 "result_cat2"
> > -#define NUM_OF_RUNS 5
> > -#define MAX_DIFF_PERCENT 4
> > -#define MAX_DIFF 1000000
> > +#define RESULT_FILE_NAME "result_cat"
> > +#define NUM_OF_RUNS 5
> > +#define MIN_DIFF_PERCENT_PER_BIT 2
>
> Could you please start a new trend that adds documentation
> that explains what this constant means and how it was chosen?

I can try although that particular 2 was a bit handwavy that just seems to
work with the tests I performed.

> > +static unsigned long current_mask;
> > +static long prev_avg_llc_val;
> >
> > /*
> > * Change schemata. Write schemata to specified
> > @@ -28,13 +29,24 @@ static int cat_setup(struct resctrl_val_param *p)
> > int ret = 0;
> >
> > /* Run NUM_OF_RUNS times */
> > - if (p->num_of_runs >= NUM_OF_RUNS)
> > - return END_OF_TESTS;
> > + if (p->num_of_runs >= NUM_OF_RUNS) {
> > + /* Remove one bit from the consecutive block */
> > + current_mask &= current_mask >> 1;
> > + if (!current_mask)
> > + return END_OF_TESTS;
> > +
> > + p->num_of_runs = 0;
>
> This seems like a workaround to get the schemata to be written. It is
> problematic since now p->num_of_runs no longer accurately reflects the
> number of test runs.

This is already the case. MBA test works around this very same problem by
using a custom static variable (runs_per_allocation) which is reset to 0
every NUM_OF_RUNS tests and not keeping ->num_of_runs at all. If MBA test
would replace runs_per_allocation with use of ->num_of_runs, it would
match what the new CAT test does.

Nothing currently relies on ->num_of_runs counting across the different
"tests" that are run inside CAT and MBA tests. And I don't have anything
immediately around the corner that would require ->num_of_runs to count
total number of repetitions that were ran.

I guess it would be possible to attempt to consolidate that second layer
MBA and the rewritten CAT tests need somehow into resctrl_val_param. But
IMHO that too is low-prio refactor as nothing is broken as is.

> I was expecting this mask manipulation to be
> in cat_val() so that it is clear how test works instead of part
> of the logic handled here.

That seems to be moving into opposite direction from how things are
currently handled. Doing it in cat_val() would be relying less on
->setup(). If that's the preferred direction, then the question becomes,
should CAT test do anything in ->setup() because also the schemata
writing could be done in directly cat_val().

What I would prefer not to do is to have a rule which says: if there's a
test-specific function, don't use ->setup() but do any setup directly
in the test-specific function but, otherwise use ->setup(). Such an
inconsistency would make things hard to track.

> > + }
> >
> > if (p->num_of_runs == 0) {
> > - sprintf(schemata, "%lx", p->mask);
> > - ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no,
> > - p->resctrl_val);
> > + snprintf(schemata, sizeof(schemata), "%lx", p->mask & ~current_mask);
> > + ret = write_schemata("", schemata, p->cpu_no, p->resctrl_val);
> > + if (ret)
> > + return ret;
> > + snprintf(schemata, sizeof(schemata), "%lx", current_mask);
> > + ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no, p->resctrl_val);
> > + if (ret)
> > + return ret;
> > }
> > p->num_of_runs++;
> >
>
> ...
>
> > @@ -126,7 +162,7 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> > ret = get_mask_no_shareable(cache_type, &long_mask);
> > if (ret)
> > return ret;
> > - count_of_bits = count_consecutive_bits(long_mask, NULL);
> > + count_of_bits = count_consecutive_bits(long_mask, &start);
> >
> > /* Get L3/L2 cache size */
> > ret = get_cache_size(cpu_no, cache_type, &cache_size);
> > @@ -143,99 +179,29 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> > count_of_bits - 1);
> > return -1;
> > }
> > -
> > - /* Get core id from same socket for running another thread */
> > - sibling_cpu_no = get_core_sibling(cpu_no);
>
> Do any users of get_core_sibling() remain after this?

Correct observation, there seems to be no other users after this is
removed.

--
i.