Re: [PATCH v2 4/4] selftests/resctrl: Add non-contiguous CBMs CAT test

From: Maciej Wieczór-Retman
Date: Wed Jan 17 2024 - 03:27:11 EST


Hi, thanks for the review!

On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote:
>Hi Maciej,
>
>On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
>> Add tests for both L2 and L3 CAT to verify the return values
>> generated by writing non-contiguous CBMs don't contradict the
>> reported non-contiguous support information.
>>
>> Use a logical XOR to confirm return value of write_schemata() and
>> non-contiguous CBMs support information match.
>>
>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@xxxxxxxxx>
>> ---
>> Changelog v2:
>> - Redo the patch message. (Ilpo)
>> - Tidy up __cpuid_count calls. (Ilpo)
>> - Remove redundant AND in noncont_mask calculations (Ilpo)
>> - Fix bit_center offset.
>> - Add newline before function return. (Ilpo)
>> - Group non-contiguous tests with CAT tests. (Ilpo)
>> - Use a helper for reading sparse_masks file. (Ilpo)
>> - Make get_cache_level() available in other source files. (Ilpo)
>>
>> tools/testing/selftests/resctrl/cat_test.c | 75 +++++++++++++++++++
>> tools/testing/selftests/resctrl/resctrl.h | 3 +
>> .../testing/selftests/resctrl/resctrl_tests.c | 2 +
>> tools/testing/selftests/resctrl/resctrlfs.c | 2 +-
>> 4 files changed, 81 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>> index 7dc7206b3b99..ecf553a89aae 100644
>> --- a/tools/testing/selftests/resctrl/cat_test.c
>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>> @@ -292,6 +292,65 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
>> return ret;
>> }
>>
>> +static int noncont_cat_run_test(const struct resctrl_test *test,
>> + const struct user_params *uparams)
>> +{
>> + unsigned long full_cache_mask, cont_mask, noncont_mask;
>> + unsigned int eax, ebx, ecx, edx, ret;
>> + int level, bit_center, sparse_masks;
>> + char schemata[64];
>> +
>> + /* Check to compare sparse_masks content to cpuid output. */
>
>"cpuid" -> "CPUID" (to note it is an instruction)
>

Thanks, will change

>> + sparse_masks = read_info_res_file(test->resource, "sparse_masks");
>> + if (sparse_masks < 0)
>> + return sparse_masks;
>> +
>> + level = get_cache_level(test->resource);
>> + if (level < 0)
>> + return -EINVAL;
>> + __cpuid_count(0x10, 4 - level, eax, ebx, ecx, edx);
>
>Please do not invent relationships. Please replace the "4 - level" with
>specific index used that depends on particular cache. The cache level
>may not even be needed, just use the resource to determine the correct
>index.

I'll move this back to 'if (!strcmp(test->resource, "L3") ... ' structure then.

>> +
>> + if (sparse_masks != ((ecx >> 3) & 1))
>> + return -1;
>
>Can a message be displayed to support the debugging this test failure?

Sure, that is a very good idea. I'll add ksft_perror() here.

>> +
>> + /* Write checks initialization. */
>> + ret = get_full_cbm(test->resource, &full_cache_mask);
>> + if (ret < 0)
>> + return ret;
>
>I assume this test failure relies on the error message from get_bit_mask()
>that is called via get_full_cbm()?

Yes, I thought adding more prints here might look redundant.

>> + bit_center = count_bits(full_cache_mask) / 2;
>> + cont_mask = full_cache_mask >> bit_center;
>> +
>> + /* Contiguous mask write check. */
>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
>> + ret = write_schemata("", schemata, uparams->cpu, test->resource);
>> + if (ret)
>> + return ret;
>
>How will user know what failed? I am seeing this single test exercise a few scenarios
>and it is not obvious to me if the issue will be clear if this test,
>noncont_cat_run_test(), fails.

write_schemata() either succeeds with '0' or errors out with a negative value. If
the contiguous mask write fails, write_schemata should print out what was wrong
and I believe that the test will report an error rather than failure.

>> +
>> + /*
>> + * Non-contiguous mask write check. CBM has a 0xf hole approximately in the middle.
>> + * Output is compared with support information to catch any edge case errors.
>> + */
>> + noncont_mask = ~(0xf << (bit_center - 2)) & full_cache_mask;
>> + snprintf(schemata, sizeof(schemata), "%lx", noncont_mask);
>> + ret = write_schemata("", schemata, uparams->cpu, test->resource);
>> + if (ret && sparse_masks)
>> + ksft_print_msg("Non-contiguous CBMs supported but write failed\n");
>> + else if (ret && !sparse_masks)
>> + ksft_print_msg("Non-contiguous CBMs not supported and write failed as expected\n");
>> + else if (!ret && !sparse_masks)
>> + ksft_print_msg("Non-contiguous CBMs not supported but write succeeded\n");
>
>Can these messages be made more specific with a "write" changed to "write of
>non-contiguous CBM"

Sure, will fix it.

>> +
>> + return !ret == !sparse_masks;
>
>Please return negative on error. Ilpo just did a big cleanup to address this.

I believe this is resolved now.

>> +}
>> +
>> +static bool noncont_cat_feature_check(const struct resctrl_test *test)
>> +{
>> + if (!resctrl_resource_exists(test->resource))
>> + return false;
>> +
>> + return resctrl_cache_feature_exists(test->resource, "sparse_masks");
>> +}
>> +
>> struct resctrl_test l3_cat_test = {
>> .name = "L3_CAT",
>> .group = "CAT",
>> @@ -299,3 +358,19 @@ struct resctrl_test l3_cat_test = {
>> .feature_check = test_resource_feature_check,
>> .run_test = cat_run_test,
>> };
>> +
>> +struct resctrl_test l3_noncont_cat_test = {
>> + .name = "L3_NONCONT_CAT",
>> + .group = "CAT",
>> + .resource = "L3",
>> + .feature_check = noncont_cat_feature_check,
>> + .run_test = noncont_cat_run_test,
>> +};
>> +
>> +struct resctrl_test l2_noncont_cat_test = {
>> + .name = "L2_NONCONT_CAT",
>> + .group = "CAT",
>> + .resource = "L2",
>> + .feature_check = noncont_cat_feature_check,
>> + .run_test = noncont_cat_run_test,
>> +};
>> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
>> index 74041a35d4ba..7b7a48d1fddd 100644
>> --- a/tools/testing/selftests/resctrl/resctrl.h
>> +++ b/tools/testing/selftests/resctrl/resctrl.h
>> @@ -165,6 +165,7 @@ unsigned int count_contiguous_bits(unsigned long val, unsigned int *start);
>> int get_full_cbm(const char *cache_type, unsigned long *mask);
>> int get_mask_no_shareable(const char *cache_type, unsigned long *mask);
>> int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size);
>> +int get_cache_level(const char *cache_type);
>> int read_info_res_file(const char *resource, const char *filename);
>> void ctrlc_handler(int signum, siginfo_t *info, void *ptr);
>> int signal_handler_register(void);
>> @@ -201,5 +202,7 @@ extern struct resctrl_test mbm_test;
>> extern struct resctrl_test mba_test;
>> extern struct resctrl_test cmt_test;
>> extern struct resctrl_test l3_cat_test;
>> +extern struct resctrl_test l3_noncont_cat_test;
>> +extern struct resctrl_test l2_noncont_cat_test;
>>
>> #endif /* RESCTRL_H */
>> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
>> index 3044179ee6e9..f3dc1b9696e7 100644
>> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
>> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
>> @@ -19,6 +19,8 @@ static struct resctrl_test *resctrl_tests[] = {
>> &mba_test,
>> &cmt_test,
>> &l3_cat_test,
>> + &l3_noncont_cat_test,
>> + &l2_noncont_cat_test,
>> };
>>
>> static int detect_vendor(void)
>> diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
>> index 8546421f0940..8bd30973fec3 100644
>> --- a/tools/testing/selftests/resctrl/resctrlfs.c
>> +++ b/tools/testing/selftests/resctrl/resctrlfs.c
>> @@ -100,7 +100,7 @@ int umount_resctrlfs(void)
>> *
>> * Return: cache level as integer or -1 if @cache_type is invalid.
>> */
>> -static int get_cache_level(const char *cache_type)
>> +int get_cache_level(const char *cache_type)
>> {
>> if (!strcmp(cache_type, "L3"))
>> return 3;
>
>
>Reinette

--
Kind regards
Maciej Wieczór-Retman