Re: [PATCH] selftests/resctrl: Add non-contiguous CBMs CAT test

From: Ilpo Järvinen
Date: Fri Nov 17 2023 - 07:56:04 EST


On Thu, 9 Nov 2023, Maciej Wieczor-Retman wrote:

> Non-contiguous CBM support for Intel CAT has been merged into the kernel
> with Commit 0e3cd31f6e90 ("x86/resctrl: Enable non-contiguous CBMs in
> Intel CAT") but there is no selftest that would validate if this feature
> works correctly.
>
> The selftest needs to verify if writing non-contiguous CBMs to the
> schemata file behaves as expected in comparison to the information about
> non-contiguous CBMs support.
>
> Add tests for both L2 and L3 CAT to verify if the return values
> generated by writing non-contiguous CBMs don't contradict the
> reported non-contiguous support information.

"if ... don't" sounds weird to me. Perhaps the "if" could just be dropped
from it.

> Comparing the return value of write_schemata() with non-contiguous CBMs
> support information can be simplified as a logical XOR operation. In
> other words if non-contiguous CBMs are supported and if non-contiguous
> write succeeds the test should succeed and if the write fails the test
> should also fail. The opposite should happen if non-contiguous CBMs are
> not supported.

To me this sounds a bit verbose given how basic thing it talks about
(but maybe I'm too old already to have actually come across a few xor
tricks in the past :-)). I'd simplify it to (or simply drop it):

Use a logical XOR to confirm return value of write_schemata() and
non-contiguous CBMs support information match.

> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@xxxxxxxxx>
>
> ---
>
> This patch is based on a rework of resctrl selftests that's currently in
> review [1]. The patch also implements a similiar functionality presented
> in the bash script included in the cover letter to the original
> non-contiguous CBMs in Intel CAT series [2].
>
> [1] https://lore.kernel.org/all/20231024092634.7122-1-ilpo.jarvinen@xxxxxxxxxxxxxxx/
> [2] https://lore.kernel.org/all/cover.1696934091.git.maciej.wieczor-retman@xxxxxxxxx/
>
> tools/testing/selftests/resctrl/cat_test.c | 97 +++++++++++++++++++
> tools/testing/selftests/resctrl/resctrl.h | 2 +
> .../testing/selftests/resctrl/resctrl_tests.c | 2 +
> 3 files changed, 101 insertions(+)
>
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index bc88eb891f35..6a01a5da30b4 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -342,6 +342,87 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
> return ret;
> }
>
> +static int noncont_cat_run_test(const struct resctrl_test *test,
> + const struct user_params *uparams)
> +{
> + unsigned long full_cache_mask, cont_mask, noncont_mask;
> + unsigned int eax, ebx, ecx, edx, ret, sparse_masks;
> + char res_path[PATH_MAX];
> + char schemata[64];
> + int bit_center;
> + FILE *fp;
> +
> + /* Check to compare sparse_masks content to cpuid output. */
> + snprintf(res_path, sizeof(res_path), "%s/%s/%s", INFO_PATH,
> + test->resource, "sparse_masks");
> +
> + fp = fopen(res_path, "r");
> + if (!fp) {
> + perror("# Error in opening file\n");
> + return errno;
> + }
> +
> + if (fscanf(fp, "%u", &sparse_masks) <= 0) {
> + perror("Could not get sparse_masks contents\n");
> + fclose(fp);
> + return -1;
> + }
> +
> + fclose(fp);

Add a function to do this conversion into resctrlfs.c.

> +
> + if (!strcmp(test->resource, "L3"))
> + __cpuid_count(0x10, 1, eax, ebx, ecx, edx);
> + else if (!strcmp(test->resource, "L2"))
> + __cpuid_count(0x10, 2, eax, ebx, ecx, edx);
> + else
> + return -EINVAL;

This would be same as (you need to make the func non-static though):
level = get_cache_level(test->resource);
if (level < 0)
return -EINVAL;
__cpuid_count(0x10, 4 - level, eax, ebx, ecx, edx);

> + if (sparse_masks != ((ecx >> 3) & 1))
> + return -1;
> +
> + /* Write checks initialization. */
> + ret = get_cbm_mask(test->resource, &full_cache_mask);
> + if (ret < 0)
> + return ret;
> + bit_center = count_bits(full_cache_mask) / 2;
> + cont_mask = full_cache_mask >> bit_center;
> +
> + /* Contiguous mask write check. */
> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
> + ret = write_schemata("", schemata, uparams->cpu, test->resource);
> + if (ret)
> + return ret;
> +
> + /*
> + * Non-contiguous mask write check. CBM has a 0xf hole approximately in the middle.
> + * Output is compared with support information to catch any edge case errors.
> + */
> + noncont_mask = ~(full_cache_mask & (0xf << bit_center)) & full_cache_mask;

Why is the full_cache_mask & part needed here? It's not like the second
and can grow bits outside of full_cache_mask even if that would overflow
the full_cache_mask (it won't be testing hole then though but that
problem happens also at the boundary condition one bit prior to
overflowing the mask).

> + snprintf(schemata, sizeof(schemata), "%lx", noncont_mask);
> + ret = write_schemata("", schemata, uparams->cpu, test->resource);
> + if (ret && sparse_masks)
> + ksft_print_msg("Non-contiguous CBMs supported but write failed\n");
> + else if (ret && !sparse_masks)
> + ksft_print_msg("Non-contiguous CBMs not supported and write failed as expected\n");
> + else if (!ret && !sparse_masks)
> + ksft_print_msg("Non-contiguous CBMs not supported but write succeeded\n");

Newline.

> + return !ret == !sparse_masks;
> +}
> +
> +static bool noncont_cat_feature_check(const struct resctrl_test *test)
> +{
> + char res_path[PATH_MAX];
> + struct stat statbuf;
> +
> + snprintf(res_path, sizeof(res_path), "%s/%s/%s", INFO_PATH,
> + test->resource, "sparse_masks");
> +
> + if (stat(res_path, &statbuf))
> + return false;

This looks generic enough that validate_resctrl_feature_request() should
be somehow adapted to cover also these cases. Perhaps it would be best to
just split validate_resctrl_feature_request() into multiple functions.

> + return test_resource_feature_check(test);
> +}
> +
> struct resctrl_test l3_cat_test = {
> .name = "L3_CAT",
> .group = "CAT",
> @@ -357,3 +438,19 @@ struct resctrl_test l2_cat_test = {
> .feature_check = test_resource_feature_check,
> .run_test = cat_run_test,
> };
> +
> +struct resctrl_test l3_noncont_cat_test = {
> + .name = "L3_NONCONT_CAT",
> + .group = "NONCONT_CAT",

> +struct resctrl_test l2_noncont_cat_test = {
> + .name = "L2_NONCONT_CAT",
> + .group = "NONCONT_CAT",

I think these should be grouped among "CAT" group because it well, tests
CAT functionality. Why you think a separate group for them is needed?

--
i.