Re: [RFC PATCH 0/7] Add driver registration i/f to resctrl

From: Reinette Chatre
Date: Tue May 09 2023 - 17:36:09 EST


Hi Tony,

On 5/8/2023 11:32 AM, Luck, Tony wrote:
> Reinette,
>
> Thanks for the review. Your comments on individual patches are all good, but I'll try to address
> them here to keep all the pieces together.
>
> You sum up everything in your concise:
>
> "I would like to understand these two motivations better."
>
> I'll try to give some additional details. But the specific features of the drivers I
> want to add are still under wraps, so I will have to be somewhat vague at times.
> If that means that a final decision can't be made until more details are forthcoming,
> that's fine. At least the discussion has begun now so people have some time
> to think about this well before I need to get code upstream. Or, if the answer
> is "No. We will never create s/w layers in the resctrl code.", then I can go
> back to puzzling out other solutions.
>
> Intel has always had some model specific "RDT" features that we have
> not attempted to push to upstream. Rationale is that architectural features
> have lasting value across CPU generations. Once the code is upstream it
> "just works" (TM) for all time. Cluttering up core code with features that are
> only applicable to one or two CPU generations seemed like a recipe for
> disaster from a long-term maintenance perspective.

Could you please elaborate on how this seems like a "recipe for disaster"? I
can certainly see how removing a driver is easy when it is decided that
something is "end of life". In practice, though, I rarely see "end of life",
and it is not clear to me why removing obsolete code should be viewed as a
"disaster". Re-factoring code occurs frequently.

> But things are changing. Previously the end user industries that wanted
> the model specific features were content for out-of-tree patches for their
> specific systems. We now have some features that may be useful to a
> wider audience, ones that require a solution integrated into upstream
> so that they don't take on technical debt to move the solution to new
> kernel versions.
>
> I'm also guessing that with other architectures (AMD, ARM, RISC-V)
> all building on the resctrl foundation, some of them may also have some
> features that don't fit neatly into the core of resctrl.

They absolutely have features that don't fit neatly into the core of
resctrl. I am not able to tell whether this style of interface would
solve this.

The second motivation from your original email was that these new features
have "input parameters that do not fit neatly into the existing schemata
model". It is not obvious to me where you address that, but maybe it is
related to "Q6".

> My RFC patches were just to show that creating a s/w layer in resctrl
> is possible. If there is a better dividing line between core code and
> architecture/model specific code I'm happy to discuss cleaner ways to
> draw the line. E.g. you make the point here and in one of your comments
> to the individual patches that making new resctrl resources may be a
> cleaner solution. I didn't do it that way partly because the existing code
> has a static array of resources. But it might be relatively simple to transform
> that into a list to make dynamically adding new resources easier. But see
> Q6 below about tracking domains in a resource for other challenges.
>
> Specific questions you raise:
>
> Q1) Will there be a need for drivers to "call into" resctrl rather than rely
> on call backs? May make locking complex.
>
> A1) So far I haven't found a case. But I only have three drivers (in addition
> to the example one) so it is possible that more complex things may be needed.
> I agree that this will raise many locking challenges. Exporting the resctrl
> mutex seems like a terrible idea.

I agree that we should not export the mutex. You may be interested in James's
ongoing work to separate the locks:
https://lore.kernel.org/lkml/20230320172620.18254-1-james.morse@xxxxxxx/

It would be great if you could join these discussions to see if there are
some common requirements that can be solved together.

> Q2) What about exclusive groups?
> A2) I didn't try to handle that in this RFC. Additional bits will be needed.

It seems like every time a driver needs "additional bits" it would impact
all the other drivers.

>
> Q3) How to make visible to the driver other resctrl assumptions (e.g. default
> group is CLOSID=0, RMID=0).

Actually I think resctrl assumptions should be _invisible_ to drivers.

> A3) I think this specific example is unlikely to ever change (it is somewhat tied
> to the power-on/reset state of the IA32_PQR_ASSOC register). But the general
> point is true that assumptions by drivers may create challenges to refactor core
> code in ways that break those assumptions.
>
> Q4) Suppressing schemata resources from a driver surprised you.
> A4) This is, as you guessed, about conflicting resources. There are other ways it
> could be handled. E.g. use existing mount options to suppress the resource from
> the schemata. To be safe that might also need some way to fail the load of a driver
> that needs other access to a resource unless the correct mount options are in
> force.

From above it sounds like there may be scenarios where a driver layer would still
be accompanied by core changes (like mount options added to the core that will
allow/deny certain drivers). If there was no driver layer it could just be handled
in a single spot.

The second part of my original question was "Where would it be decided whether
the overriding driver should be loaded and why can that logic not be in
enumeration within resctrl?" Is it the user that needs to determine that there
are conflicting resources?

> Q5) Boundaries are not clear. Too much of resctrl internals made visible to drivers.
> A5) Can work on more abstract interfaces if we move forward with some sort of
> layered approach.
>
> Q6) Domain awareness of drivers.
> A6) This is a challenge. Especially as the domain for a driver may not match up
> with any existing resource scope (e.g. driver may be socket scoped, which may
> not be the same as "L3 cache" scoped). After I posted this series I added
> an entry in the resource table with socket scope to handle this. Dynamically adding
> a new resource with a custom scope has challenges (because the domain lists
> attached to that resource are maintained by the resctrl CPU hotplug callbacks as
> CPUs come online and go offline).

My comment was not about a need to make drivers "domain aware". My assumption
was that drivers are not domain aware, since I did not see any related
information shared with the drivers. Since the drivers override the schemata
entries, I thus assumed that the schemata entries use some driver-specific
scope.
The challenge of adding a resource with a custom scope seems like the biggest
problem raised thus far. Is this perhaps what started the venture down this
driver interface?

Reinette