Re: [PATCH v2 24/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test

From: Reinette Chatre
Date: Wed Apr 26 2023 - 19:35:38 EST


Hi Ilpo,

On 4/26/2023 6:58 AM, Ilpo Järvinen wrote:
> On Fri, 21 Apr 2023, Reinette Chatre wrote:
>> On 4/18/2023 4:45 AM, Ilpo Järvinen wrote:

...

>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>>> index 4b505fdb35d7..85053829b9c5 100644
>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>> @@ -11,11 +11,12 @@
>>> #include "resctrl.h"
>>> #include <unistd.h>
>>>
>>> -#define RESULT_FILE_NAME1 "result_cat1"
>>> -#define RESULT_FILE_NAME2 "result_cat2"
>>> -#define NUM_OF_RUNS 5
>>> -#define MAX_DIFF_PERCENT 4
>>> -#define MAX_DIFF 1000000
>>> +#define RESULT_FILE_NAME "result_cat"
>>> +#define NUM_OF_RUNS 5
>>> +#define MIN_DIFF_PERCENT_PER_BIT 2
>>
>> Could you please start a new trend that adds documentation
>> that explains what this constant means and how it was chosen?
>
> I can try although that particular 2 was a bit handwavy that just seems to
> work with the tests I performed.

The changelog claims that the existing CAT test does not work, with
this new test offered as its replacement. Considering that, I do think it
is important to have confidence that this test is able to test CAT.
The words "handwavy" and "seems to work" are red flags to me.
When merged, these tests will be run on a variety of platforms with
various configurations. Using test criteria based on measurements
from one particular system may work, but there needs to be confidence
that the criteria map to all systems these tests will be run on.
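
To illustrate the kind of documentation I am asking for, here is a minimal
sketch. The meaning given to the constant below is only my assumption about
what it might express (a minimum percentage change in the average LLC value
per removed mask bit); it is not taken from the patch, and
llc_diff_sufficient() is a hypothetical helper, not an existing selftest
function.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * ASSUMED meaning, for illustration only: the minimum change, in percent
 * of the previous average, expected for every bit removed from the mask.
 * Real documentation should also state how the value was derived.
 */
#define MIN_DIFF_PERCENT_PER_BIT	2

/*
 * Hypothetical check: did the average LLC value grow by at least
 * MIN_DIFF_PERCENT_PER_BIT percent per removed bit?
 */
static bool llc_diff_sufficient(long prev_avg, long cur_avg, int bits_removed)
{
	long min_diff;

	assert(prev_avg > 0 && bits_removed > 0);

	min_diff = prev_avg * MIN_DIFF_PERCENT_PER_BIT * bits_removed / 100;

	return cur_avg - prev_avg >= min_diff;
}
```

With a comment like this, a reader can at least see what a failure means and
judge whether the threshold is plausible on their platform.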

>
>>> +static unsigned long current_mask;
>>> +static long prev_avg_llc_val;
>>>
>>> /*
>>> * Change schemata. Write schemata to specified
>>> @@ -28,13 +29,24 @@ static int cat_setup(struct resctrl_val_param *p)
>>> 	int ret = 0;
>>>
>>> 	/* Run NUM_OF_RUNS times */
>>> -	if (p->num_of_runs >= NUM_OF_RUNS)
>>> -		return END_OF_TESTS;
>>> +	if (p->num_of_runs >= NUM_OF_RUNS) {
>>> +		/* Remove one bit from the consecutive block */
>>> +		current_mask &= current_mask >> 1;
>>> +		if (!current_mask)
>>> +			return END_OF_TESTS;
>>> +
>>> +		p->num_of_runs = 0;
>>
>> This seems like a workaround to get the schemata to be written. It is
>> problematic since now p->num_of_runs no longer accurately reflects the
>> number of test runs.
>
> This is already the case. MBA test works around this very same problem by
> using a custom static variable (runs_per_allocation) which is reset to 0
> every NUM_OF_RUNS tests and not keeping ->num_of_runs at all. If MBA test
> would replace runs_per_allocation with use of ->num_of_runs, it would
> match what the new CAT test does.
>
> Nothing currently relies on ->num_of_runs counting across the different
> "tests" that are run inside CAT and MBA tests. And I don't have anything
> immediately around the corner that would require ->num_of_runs to count
> total number of repetitions that were run.
>
> I guess it would be possible to attempt to consolidate that second layer
> MBA and the rewritten CAT tests need somehow into resctrl_val_param. But
> IMHO that too is low-prio refactor as nothing is broken as is.

I do not think that I would use any of the other tests as reference
since all the other tests rely on the same wrapper (resctrl_val())
by providing it their own customization (via aptly named ... struct
resctrl_val_param).
The CAT test is already unique by _not_ using resctrl_val() but its
own test. I do not see why those resctrl_val() customizations need to
propagate to the CAT test if it is not using the wrapper to begin with.

>
>> I was expecting this mask manipulation to be
>> in cat_val() so that it is clear how test works instead of part
>> of the logic handled here.
>
> That seems to be moving into opposite direction from how things are
> currently handled. Doing it in cat_val() would be relying less on
> ->setup(). If that's the preferred direction, then the question becomes,
> should CAT test do anything in ->setup() because also the schemata
> writing could be done directly in cat_val().
>
> What I would prefer not to do is to have a rule which says: if there's a
> test-specific function, don't use ->setup() but do any setup directly
> in the test-specific function but, otherwise use ->setup(). Such an
> inconsistency would make things hard to track.

The test-specific function can still call a setup function, but it
can be done directly instead of via "struct resctrl_val_param". The
test-specific function already transitioned away from using resctrl_val(),
so it is not clear to me why there should be rules about how
function pointers within "struct resctrl_val_param" should be used or
indeed why "struct resctrl_val_param" should be used at all.
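
As a rough illustration of what I mean, and not a request for this exact
code, the mask manipulation and the setup call could live directly in the
test-specific function. The names cat_test_sketch(), write_mask() and
measure_once() are hypothetical and only stand in for the real schemata
write and measurement; only NUM_OF_RUNS and the mask update itself come
from the patch.

```c
#include <assert.h>
#include <stdio.h>

#define NUM_OF_RUNS	5

/* Hypothetical stand-in for writing the schemata; not a selftest API. */
static int write_mask(unsigned long mask)
{
	printf("schemata mask: 0x%lx\n", mask);
	return 0;
}

/* Hypothetical stand-in for one measurement run. */
static int measure_once(unsigned long mask)
{
	(void)mask;
	return 0;
}

/*
 * Walk the masks explicitly: set up each mask directly, run NUM_OF_RUNS
 * measurements with it, then remove one bit from the consecutive block.
 * Returns the number of masks tested, or -1 on error.
 */
static int cat_test_sketch(unsigned long start_mask)
{
	unsigned long mask;
	int tested = 0;
	int i;

	/* An empty mask is not a valid starting point. */
	assert(start_mask != 0);

	for (mask = start_mask; mask; mask &= mask >> 1) {
		if (write_mask(mask))
			return -1;

		for (i = 0; i < NUM_OF_RUNS; i++) {
			if (measure_once(mask))
				return -1;
		}
		tested++;
	}

	return tested;
}
```

Starting from 0xf0, this walks 0xf0, 0x70, 0x30 and 0x10. The run counter
stays local to each mask, so nothing needs to be reset behind the caller's
back.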

Reinette