RE: [RFC PATCH 0/7] Add driver registration i/f to resctrl

From: Luck, Tony
Date: Mon May 08 2023 - 14:32:40 EST


Reinette,

Thanks for the review. Your comments on individual patches are all good, but I'll try to address
them here to keep all the pieces together.

You sum up everything in your concise:

"I would like to understand these two motivations better."

I'll try to give some additional details. But the specific features of the drivers I
want to add are still under wraps, so I will have to be somewhat vague at times.
If that means that a final decision can't be made until more details are forthcoming,
that's fine. At least the discussion has begun now so people have some time
to think about this well before I need to get code upstream. Or if the answer
is "No. We will never create s/w layers in the resctrl code." Then I can go
back to puzzling some other solutions.

Intel has always had some model specific "RDT" features that we have
not attempted to push to upstream. Rationale is that architectural features
have lasting value across CPU generations. Once the code is upstream it
"just works" (TM) for all time. Cluttering up core code with features that are
only applicable to one or two CPU generations seemed like a recipe for
disaster from a long-term maintenance perspective.

But things are changing. Previously the end user industries that wanted
the model specific features were content for out-of-tree patches for their
specific systems. We now have some features that may be useful to a
wider audience, ones that requires a solution integrated into upstream
so that they don't take on technical debt to move the solution to new
kernel versions.

I'm also guessing that with other architectures (AMD, ARM, RISC-V)
all building on the resctrl foundation, some of them may also have some
features that don't fit neatly into the core of resctrl.

My RFC patches were just to show that creating a s/w layer in resctrl
is possible. If there is a better dividing line between core code and
architecture/model specific code I'm happy to discuss cleaner ways to
draw the line. E.g. you make the point here and in one of your comments
to the individual patches that making new resctrl resources may be a
cleaner solution. I didn't do it that way partly because the existing code
has a static array of resources. But it might be relatively simple to transform
that into a list to make dynamically adding new resources easier. But see
Q6 below about tracking domains in a resource for other challenges.

Specific questions you raise:

Q1) Will there be a need for drivers to "call into" resctrl rather than rely
on call backs? May make locking complex.

A1) So far I haven't found a case. But I only have three drivers (in addition
to the example one) so it is possible that more complex things may be needed.
I agree that this will raise many locking challenges. Exporting the resctrl
mutex seems like a terrible idea.

Q2) What about exclusive groups?
A2) I didn’t try to handle in this RFC. Additional bits will be needed.

Q3) How to make visible to the driver other resctrl assumptions (e.g. default
group is CLOSID=0, RMID=0).
A3) I think this specific example is unlikely to ever change (it is somewhat tied
to the power-on/reset state of the IA32_PQR_ASSOC register. But the general
point is true that assumptions by drivers may create challenges to refactor core
code in ways that break those assumptions.

Q4) Suppressing schemata resources from a driver surprised you.
A4) This is, as you guessed, about conflicting resources. There are other ways it
could be handled. E.g. use existing mount options to suppress the resource from
the schemata. To be safe that might also need some way to fail to load of a driver
that needs other access to a resource unless the correct mount options are in
force.

Q5) Boundaries are not clear. Too much of resctrl internals made visible to drivers.
A5) Can work on more abstract interfaces if we move forward with some sort of
layered approach.

Q6) Domain awareness of drivers.
A6) This is a challenge. Especially as the domain for a driver may not match up
with any existing resource scope (e.g. driver may be socket scoped, which may
not be the same as "L3 cache" scoped). After I posted this series I added
an entry in the resource table with socket scope to handle this. Dynamically adding
a new resource with a custom scope has challenges (because the domain lists
attached to that resource are maintained by the resctrl cpu hot plug callbacks as
CPUs come online and go offline.

-Tony