Re: [PATCH v2 19/26] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

From: Reinette Chatre
Date: Tue Nov 28 2023 - 17:18:25 EST


Hi Ilpo,

On 11/20/2023 3:13 AM, Ilpo Järvinen wrote:
> CAT test spawns two processes into two different control groups with
> exclusive schemata. Both the processes alloc a buffer from memory
> matching their allocated LLC block size and flush the entire buffer out
> of caches. Since the processes are reading through the buffer only once
> during the measurement and initially all the buffer was flushed, the
> test isn't testing CAT.
>
> Rewrite the CAT test to allocate a buffer sized to half of LLC. Then
> perform a sequence of tests with different LLC alloc sizes starting
> from half of the CBM bits down to 1-bit CBM. Flush the buffer before
> each test and read the buffer twice. Observe the LLC misses on the
> second read through the buffer. As the allocated LLC block gets smaller
> and smaller, the LLC misses will become larger and larger giving a
> strong signal on CAT working properly.
>
> The new CAT test is using only a single process because it relies on
> measured effect against another run of itself rather than another
> process adding noise. The rest of the system is set to use the CBM bits
> not used by the CAT test to keep the test isolated.
>
> Replace count_bits() with count_contiguous_bits() to get the first bit
> position in order to be able to calculate masks based on it.
>
> This change has been tested with a number of systems from different
> generations.
>
> Suggested-by: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
> ---
> tools/testing/selftests/resctrl/cat_test.c | 282 +++++++++-----------
> tools/testing/selftests/resctrl/fill_buf.c | 6 +-
> tools/testing/selftests/resctrl/resctrl.h | 5 +-
> tools/testing/selftests/resctrl/resctrlfs.c | 44 +--
> 4 files changed, 138 insertions(+), 199 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index cfda87667b46..4169b17b8f91 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -11,65 +11,69 @@
> #include "resctrl.h"
> #include <unistd.h>
>
> -#define RESULT_FILE_NAME1 "result_cat1"
> -#define RESULT_FILE_NAME2 "result_cat2"
> +#define RESULT_FILE_NAME "result_cat"
> #define NUM_OF_RUNS 5
> -#define MAX_DIFF_PERCENT 4
> -#define MAX_DIFF 1000000
>
> /*
> - * Change schemata. Write schemata to specified
> - * con_mon grp, mon_grp in resctrl FS.
> - * Run 5 times in order to get average values.
> + * Minimum difference in LLC misses between a test with n+1 bits CBM to the
> + * test with n bits is MIN_DIFF_PERCENT_PER_BIT * (n - 1). With e.g. 5 vs 4
> + * bits in the CBM mask, the minimum difference must be at least
> + * MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3 percent.
> + *
> + * The relationship between number of used CBM bits and difference in LLC
> + * misses is not expected to be linear. With a small number of bits, the
> + * margin is smaller than with larger number of bits. For selftest purposes,
> + * however, linear approach is enough because ultimately only pass/fail
> + * decision has to be made and distinction between strong and stronger
> + * signal is irrelevant.
> */
> -static int cat_setup(struct resctrl_val_param *p)
> -{
> - char schemata[64];
> - int ret = 0;
> -
> - /* Run NUM_OF_RUNS times */
> - if (p->num_of_runs >= NUM_OF_RUNS)
> - return END_OF_TESTS;
> -
> - if (p->num_of_runs == 0) {
> - sprintf(schemata, "%lx", p->mask);
> - ret = write_schemata(p->ctrlgrp, schemata, p->cpu_no,
> - p->resctrl_val);
> - }
> - p->num_of_runs++;
> -
> - return ret;
> -}
> +#define MIN_DIFF_PERCENT_PER_BIT 1
>
> static int show_results_info(__u64 sum_llc_val, int no_of_bits,
> - unsigned long cache_span, unsigned long max_diff,
> - unsigned long max_diff_percent, unsigned long num_of_runs,
> - bool platform)
> + unsigned long cache_span, long min_diff_percent,

With all care taken in unsigned use I wonder why min_diff_percent is just long?

> + unsigned long num_of_runs, bool platform,
> + __s64 *prev_avg_llc_val)
> {
> __u64 avg_llc_val = 0;
> - float diff_percent;
> - int ret;
> + float avg_diff;
> + int ret = 0;
>
> avg_llc_val = sum_llc_val / num_of_runs;
> - diff_percent = ((float)cache_span - avg_llc_val) / cache_span * 100;
> + if (*prev_avg_llc_val) {
> + float delta = (__s64)(avg_llc_val - *prev_avg_llc_val);
>
> - ret = platform && abs((int)diff_percent) > max_diff_percent;
> + avg_diff = delta / *prev_avg_llc_val;
> + ret = platform && (avg_diff * 100) < (float)min_diff_percent;
>
> - ksft_print_msg("%s Check cache miss rate within %lu%%\n",
> - ret ? "Fail:" : "Pass:", max_diff_percent);
> + ksft_print_msg("%s Check cache miss rate changed more than %.1f%%\n",
> + ret ? "Fail:" : "Pass:", (float)min_diff_percent);
>
> - ksft_print_msg("Percent diff=%d\n", abs((int)diff_percent));
> + ksft_print_msg("Percent diff=%.1f\n", avg_diff * 100);
> + }
> + *prev_avg_llc_val = avg_llc_val;
>
> show_cache_info(no_of_bits, avg_llc_val, cache_span, true);
>
> return ret;
> }
>
> -static int check_results(struct resctrl_val_param *param, size_t span)
> +/* Remove one bit from the consecutive cbm mask */

The use of "consecutive" is not clear. In this usage it removes one
bit from the previous mask in order to create the consecutive mask, no?

> +static unsigned long next_mask(unsigned long current_mask)
> +{
> + return current_mask & (current_mask >> 1);
> +}
> +

The new test looks very good to me. Thank you very much for creating it.

It looks to me as though this test impacts the affinity of main program
since it is only one process, changes its affinity, but never change it back.

Reinette