Re: resctrl2 - status

From: Tony Luck
Date: Tue Sep 19 2023 - 12:40:20 EST

Next message: Anton Ivanov: "Re: Arches that don't support PREEMPT"
Previous message: Steve French: "Re: Possible bug report: kernel 6.5.0/6.5.1 high load when CIFS share is mounted (cifsd-cfid-laundromat in"D" state)"
In reply to: Peter Newman: "Re: resctrl2 - status"
Next in thread: Tony Luck: "Re: resctrl2 - status"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, Sep 19, 2023 at 02:53:07PM +0200, Peter Newman wrote:
> Hi Tony,
>
> On Wed, Sep 6, 2023 at 8:21 PM Tony Luck <tony.luck@xxxxxxxxx> wrote:
> > I've just pushed an updated set of patches to:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git resctrl_v65
>
> I'm trying to understand the purpose of the resctrl_resource struct
> defined here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux.git/tree/include/linux/resctrl.h?h=resctrl2_v65#n356
>
> From the common fields, it seems to be something with a name and some
> info files that can be divided into domains and should be told when
> CPUs are hotplugged. (I skipped the scope field because I couldn't
> find any references outside fs/resctrl2/arch). The remaining fields
> are explicitly disjoint depending on whether we're talking about
> monitoring or allocation.
>
> From this I don't get a strong sense that a "resource" is really a
> thing and I think James's resctrl_schema struct only for the purpose
> of resource allocation was more the right direction for common code.
> Outwardly, resctrl groups seem to be the only thing relating
> monitoring to allocation.
>

Peter,

It's a good question. I started out with separate control_resource and
monitor_resource structures, but combined them early on when it was
looking like the overall size wasn't all that big (for a structure that
will only have a dozen or so instances) and there were a bunch of common
fields. It would be easy to separate them again if the consensus is that
doing so is cleaner. There are only two places (currently) where code walks
resources of all types (mount and unmount), so it would be no big hassle to
replace those with separate for_each_control_resource() and
for_each_monitor_resource() iterators.
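
A rough userspace sketch of what such split iterators could look like.
Only the two for_each_*() names come from the text above; the struct
layout, the type tag, and the helper functions are made up for
illustration and are not taken from the resctrl2 tree:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the structures from the resctrl2 tree. */
enum res_type { RES_CONTROL, RES_MONITOR };

struct resource {
	const char *name;
	enum res_type type;
	struct resource *next;	/* one list holding resources of both types */
};

/* Push a resource onto the front of the list. */
static void resource_register(struct resource **head, struct resource *r)
{
	assert(head && r);
	r->next = *head;
	*head = r;
}

/* Return r or the first later entry of type t, or NULL if none is left. */
static struct resource *next_of_type(struct resource *r, enum res_type t)
{
	while (r && r->type != t)
		r = r->next;
	return r;
}

#define for_each_resource_of_type(r, head, t)		\
	for ((r) = next_of_type((head), (t)); (r);	\
	     (r) = next_of_type((r)->next, (t)))

#define for_each_control_resource(r, head) \
	for_each_resource_of_type(r, head, RES_CONTROL)
#define for_each_monitor_resource(r, head) \
	for_each_resource_of_type(r, head, RES_MONITOR)

/* Example walkers: count the resources of each kind. */
static int count_control_resources(struct resource *head)
{
	struct resource *r;
	int n = 0;

	for_each_control_resource(r, head)
		n++;
	return n;
}

static int count_monitor_resources(struct resource *head)
{
	struct resource *r;
	int n = 0;

	for_each_monitor_resource(r, head)
		n++;
	return n;
}
```

With fully separate structures the type tag would go away and each
iterator would simply walk its own list; the filtered walk here only
shows that callers such as mount and unmount would not need to change
shape.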

> I skipped the scope field ...

The scope field is for the module to tell the core code the granularity
of the control/monitor resource so it can build a custom domain list
based on L2 cache scope, L3 cache scope, NUMA node scope, core scope,
or anything else that h/w folks dream up. This means only the common
code needs to register a CPU hotplug notifier.
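
To make the idea concrete, here is a minimal userspace sketch of
building a per-resource domain list from a scope value. The enum, the
topology table, and every function name are assumptions for
illustration; the real code would get the ids from the kernel's
topology helpers rather than from a table:

```c
#include <assert.h>
#include <stddef.h>

/* Granularities named in the text; the enum itself is hypothetical. */
enum scope { SCOPE_L2_CACHE, SCOPE_L3_CACHE, SCOPE_NODE, SCOPE_CORE, SCOPE_MAX };

/* Made-up topology record: the id of the enclosing unit at each scope. */
struct cpu_topo {
	int id[SCOPE_MAX];
};

#define MAX_DOMAINS 8

struct domain {
	int id;
	unsigned long cpu_mask;	/* bit n set means CPU n is in the domain */
};

struct domain_list {
	enum scope scope;	/* granularity requested by the module */
	int count;
	struct domain dom[MAX_DOMAINS];
};

static struct domain *find_domain(struct domain_list *dl, int id)
{
	for (int i = 0; i < dl->count; i++)
		if (dl->dom[i].id == id)
			return &dl->dom[i];
	return NULL;
}

/*
 * What a single shared CPU-online callback might do for each registered
 * resource: look up the id at that resource's scope and add the CPU to
 * the matching domain, creating the domain on first use.
 * Returns 0 on success, -1 if the fixed-size list is full.
 */
static int domain_list_add_cpu(struct domain_list *dl,
			       const struct cpu_topo *topo, int cpu)
{
	assert(dl->scope < SCOPE_MAX);
	assert(cpu >= 0 && (size_t)cpu < 8 * sizeof(unsigned long));

	int id = topo[cpu].id[dl->scope];
	struct domain *d = find_domain(dl, id);

	if (!d) {
		if (dl->count == MAX_DOMAINS)
			return -1;
		d = &dl->dom[dl->count++];
		d->id = id;
		d->cpu_mask = 0;
	}
	d->cpu_mask |= 1UL << cpu;
	return 0;
}
```

Because the scope is just data in the list, the same callback serves
every module, which is the point of having only the common code
register the hotplug notifier.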

Note that there is no sharing of domains between modules to allow each
module to be fully independent of any other module. This also means
that the domain structure can have a common header, with some module
specific data allocated directly after ... though the need for that
might be going away as I implement James's suggestion to keep the common
schemata parsing in the generic code.
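
The "common header plus module-specific data allocated directly after"
layout can be sketched in plain C11 with a flexible array member. The
structure and function names below are hypothetical, and struct
mon_priv only stands in for whatever a module would want to keep per
domain:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical common header shared by every module's domains. */
struct domain_hdr {
	int id;
	size_t priv_size;
	/* Module-specific data starts here, in the same allocation. */
	max_align_t priv[];
};

/* Allocate the header and priv_size zeroed bytes of module data at once. */
static struct domain_hdr *domain_alloc(int id, size_t priv_size)
{
	struct domain_hdr *d = calloc(1, sizeof(*d) + priv_size);

	if (!d)
		return NULL;
	d->id = id;
	d->priv_size = priv_size;
	return d;
}

/* Hand the module a pointer to its own area behind the header. */
static void *domain_priv(struct domain_hdr *d)
{
	assert(d);
	return d->priv;
}

/* Example of what one module might store per domain (made up). */
struct mon_priv {
	unsigned long long prev_count;
};
```

Generic code only ever touches struct domain_hdr, while a module casts
the result of domain_priv() to its own type, so no two modules need to
agree on a shared domain layout.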

> Is there a good reason for the common code to understand relationships
> between monitors and schemas with the same name? If not, I think it
> would make more sense to register monitors and control schemata
> independently.

I'm going back through my code now, breaking it into a patch series
that builds a piece at a time. One of many things to become clear as
I work through this is that the ".name" field in the resource isn't
really useful. It may disappear, to be replaced by more specific
fields based on usage. There's already one for the name of the "info/"
directory for the resource. I'm adding a "schemata_name" for control
resources. When I get to the monitor section I'll likely add a
field that gets used to construct the names of directories under
the "mon_data/" directories instead of using ".name". Doing that
would remove any apparent relationship between monitors and
schemas. There's no reason on X86 for them to be connected.
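
One way the per-usage name fields could look, as a userspace sketch.
Only "schemata_name" is named above; "info_name", "mon_name", the two
structures, and the helper functions are assumptions, and the
"mon_<name>_<id>" directory pattern is modelled on names such as
mon_L3_00 in the existing resctrl filesystem:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical split structures, each carrying only the names it uses. */
struct control_resource {
	const char *info_name;		/* directory under "info/" */
	const char *schemata_name;	/* prefix of the schemata line */
};

struct monitor_resource {
	const char *info_name;		/* directory under "info/" */
	const char *mon_name;		/* used for "mon_data/" directories */
};

/* Build e.g. "mon_data/mon_L3_00" for one monitor domain. */
static int mon_dir_name(const struct monitor_resource *m, int domain_id,
			char *buf, size_t len)
{
	assert(m && m->mon_name);
	return snprintf(buf, len, "mon_data/mon_%s_%02d",
			m->mon_name, domain_id);
}

/* Build e.g. "MB:0=100;1=50" for one control resource. */
static int schemata_line(const struct control_resource *c, const char *vals,
			 char *buf, size_t len)
{
	assert(c && c->schemata_name && vals);
	return snprintf(buf, len, "%s:%s", c->schemata_name, vals);
}
```

Nothing here ties a monitor's name to a control resource's name, so the
two kinds can be registered independently.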

-Tony
