Re: [PATCH v2 4/4] selftests/resctrl: Add non-contiguous CBMs CAT test

From: Reinette Chatre
Date: Fri Jan 19 2024 - 11:39:51 EST


Hi Maciej,

On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote:
> On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote:
>> On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote:
>>> On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote:
>>>> On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote:
>>>>> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote:
>>>>>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
>>
>>>>>>> + bit_center = count_bits(full_cache_mask) / 2;
>>>>>>> + cont_mask = full_cache_mask >> bit_center;
>>>>>>> +
>>>>>>> + /* Contiguous mask write check. */
>>>>>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
>>>>>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource);
>>>>>>> + if (ret)
>>>>>>> + return ret;
>>>>>>
>>>>>> How will user know what failed? I am seeing this single test exercise a few scenarios
>>>>>> and it is not obvious to me if the issue will be clear if this test,
>>>>>> noncont_cat_run_test(), fails.
>>>>>
>>>>> write_schemata() either succeeds with '0' or errors out with a negative value. If
>>>>> the contiguous mask write fails, write_schemata should print out what was wrong
>>>>> and I believe that the test will report an error rather than failure.
>>>>
>>>> Right. I am trying to understand whether the user will be able to decipher what failed
>>>> in case there is an error. Seems like in this case the user is expected to look at the
>>>> source code of the test to understand what the test was trying to do at the time it
>>>> encountered the failure. In this case user may be "lucky" that this test only has
>>>> one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that
>>>> reasoning to figure out which write_schemata() failed to further dig what test was
>>>> trying to do.
>>>
>>> When a write_schemata() is executed the string that is being written gets
>>> printed. If there are multiple calls in a single tests and one fails I'd imagine
>>> it would be easy for the user to figure out which one failed.
>>
>> It would be easy for the user the figure out if (a) it is obvious to the user
>> what schema a particular write_schema() call attempted to write and (b) all the
>> write_schema() calls attempt to write different schema.
>
> Okay, your comment made me wonder if on error the schemata still is printed. I
> double checked in the code and whether write_schemata() fails or not it has a
> goto path where before returning it will print out the schema. So I believe that
> satisfies your (a) condition.

Let me try with an example.
Scenario 1:
The test has the following code:
...
write_schemata(..., "0xfff", ...);
...
write_schemata(..., "0xf0f", ...);
...

Scenario 2:
The test has the following code:
...
write_schemata(..., schemata, ...);
...
write_schemata(..., schemata, ...);
...

Any failure of write_schemata() in scenario 1 will be easy to trace. As you
state, write_schemata() prints the schemata attempted and it will thus be
easy to look at the code to see which write_schemata() call failed since it
is obvious from the code which schemata was attempted.
A failure of one of the write_schemata() in scenario 2 will not be as easy
to trace since the user first needs to determine what the value of "schemata"
is at each call and that may depend on the platform, bit shifting done in test,
and state of system state at time of test.

> As for (b) depends on what you meant. Other tests that run more than one
> write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest
> that the non-contiguous test should attempt more schematas? For example shift
> the bit hole from one side to the other? I assumed one CBM with a centered bit
> hole would be enough to check if non-contiguous CBM feature works properly and
> more CBMs would be redundant.

Let me try with an example.
Scenario 1:
The test has the following code:
...
write_schemata(..., "0xfff", ...);
...
write_schemata(..., "0xf0f", ...);
...

Scenario 2:
The test has the following code:
...
write_schemata(..., "0xfff", ...);
...
write_schemata(..., "0xfff", ...);
...

A failure of either write_schemata() in scenario 1 will be easy to trace since
the schemata attempted is different in each case. The schemata printed by the
write_schemata() error message can thus easily be connected to the specific
write_schemata() call.
A failure of either write_schemata() in scenario 2 is not so obvious since they
both attempted the same schemata so the error message printed by write_schemata()
could belong to either.

Reinette