Re: [PATCH v2 4/4] selftests/resctrl: Add non-contiguous CBMs CAT test

From: Reinette Chatre
Date: Mon Jan 22 2024 - 12:32:11 EST


Hi Maciej,

On 1/21/2024 11:56 PM, Maciej Wieczór-Retman wrote:
> Hi!
>
> On 2024-01-19 at 08:39:31 -0800, Reinette Chatre wrote:
>> Hi Maciej,
>>
>> On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote:
>>> On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote:
>>>> On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote:
>>>>> On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote:
>>>>>> On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote:
>>>>>>> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote:
>>>>>>>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote:
>>>>
>>>>>>>>> + bit_center = count_bits(full_cache_mask) / 2;
>>>>>>>>> + cont_mask = full_cache_mask >> bit_center;
>>>>>>>>> +
>>>>>>>>> + /* Contiguous mask write check. */
>>>>>>>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
>>>>>>>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource);
>>>>>>>>> + if (ret)
>>>>>>>>> + return ret;
>>>>>>>>
>>>>>>>> How will user know what failed? I am seeing this single test exercise a few scenarios
>>>>>>>> and it is not obvious to me if the issue will be clear if this test,
>>>>>>>> noncont_cat_run_test(), fails.
>>>>>>>
>>>>>>> write_schemata() either succeeds with '0' or errors out with a negative value. If
>>>>>>> the contiguous mask write fails, write_schemata should print out what was wrong
>>>>>>> and I believe that the test will report an error rather than failure.
>>>>>>
>>>>>> Right. I am trying to understand whether the user will be able to decipher what failed
>>>>>> in case there is an error. Seems like in this case the user is expected to look at the
>>>>>> source code of the test to understand what the test was trying to do at the time it
>>>>>> encountered the failure. In this case user may be "lucky" that this test only has
>>>>>> one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that
>>>>>> reasoning to figure out which write_schemata() failed to further dig what test was
>>>>>> trying to do.
>>>>>
>>>>> When a write_schemata() is executed the string that is being written gets
>>>>> printed. If there are multiple calls in a single tests and one fails I'd imagine
>>>>> it would be easy for the user to figure out which one failed.
>>>>
>>>> It would be easy for the user the figure out if (a) it is obvious to the user
>>>> what schema a particular write_schema() call attempted to write and (b) all the
>>>> write_schema() calls attempt to write different schema.
>>>
>>> Okay, your comment made me wonder if on error the schemata still is printed. I
>>> double checked in the code and whether write_schemata() fails or not it has a
>>> goto path where before returning it will print out the schema. So I believe that
>>> satisfies your (a) condition.
>>
>> Let me try with an example.
>> Scenario 1:
>> The test has the following code:
>> ...
>> write_schemata(..., "0xfff", ...);
>> ...
>> write_schemata(..., "0xf0f", ...);
>> ...
>>
>> Scenario 2:
>> The test has the following code:
>> ...
>> write_schemata(..., schemata, ...);
>> ...
>> write_schemata(..., schemata, ...);
>> ...
>>
>> Any failure of write_schemata() in scenario 1 will be easy to trace. As you
>> state, write_schemata() prints the schemata attempted and it will thus be
>> easy to look at the code to see which write_schemata() call failed since it
>> is obvious from the code which schemata was attempted.
>> A failure of one of the write_schemata() in scenario 2 will not be as easy
>> to trace since the user first needs to determine what the value of "schemata"
>> is at each call and that may depend on the platform, bit shifting done in test,
>> and state of system state at time of test.
>
> Doing things similar to scenario 1 would be great from a debugging perspective
> but since the masks can have different sizes putting literals there seems
> impossible.
>
> Maybe the code could be improved by putting an example CBM in the comment above
> a write_schemata() call? "For a 12 bit maximum CBM value, the contiguous
> schemata will look like '0x3f'" and "For a 12 bit maximum CBM value, the
> non-contiguous schemata will look like '0xf0f'"
>
> This seems like the closest I could get to what you're
> showing in scenario 1 (which I assume would be the best).

I am not asking you to use literals. I am trying to demonstrate that the only way
it would be obvious to the user where a failure is is when the test uses literals.
I continue to try to motivate for clear indication to user/developer what failed
when this test failed ... this could just be a ksft_print_msg() when the
write_schemata() call we are talking about fails.

>
>>> As for (b) depends on what you meant. Other tests that run more than one
>>> write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest
>>> that the non-contiguous test should attempt more schematas? For example shift
>>> the bit hole from one side to the other? I assumed one CBM with a centered bit
>>> hole would be enough to check if non-contiguous CBM feature works properly and
>>> more CBMs would be redundant.
>>
>> Let me try with an example.
>> Scenario 1:
>> The test has the following code:
>> ...
>> write_schemata(..., "0xfff", ...);
>> ...
>> write_schemata(..., "0xf0f", ...);
>> ...
>>
>> Scenario 2:
>> The test has the following code:
>> ...
>> write_schemata(..., "0xfff", ...);
>> ...
>> write_schemata(..., "0xfff", ...);
>> ...
>>
>> A failure of either write_schemata() in scenario 1 will be easy to trace since
>> the schemata attempted is different in each case. The schemata printed by the
>> write_schemata() error message can thus easily be connected to the specific
>> write_schemata() call.
>> A failure of either write_schemata() in scenario 2 is not so obvious since they
>> both attempted the same schemata so the error message printed by write_schemata()
>> could belong to either.
>
> I believe my code follows the first scenario example (since one schemata is half
> the full CBM, and the other one is the full CBM with a hole in the middle).

I know. This thread digressed into discussion about when it would be ok to
omit error message from caller of write_schemata().

> I'm sorry to drag this thread out but I want to be sure if I'm right or are you
> suggesting something and I missed it?

Please just add a ksft_print_msg() to noncont_cat_run_test() when this
write_schemata() fails.

Reinette