Re: resctrl2 - status

From: Tony Luck
Date: Fri Aug 25 2023 - 16:59:32 EST


On Fri, Aug 25, 2023 at 01:20:22PM -0700, Reinette Chatre wrote:
> Hi Tony,
>
> On 8/25/2023 12:44 PM, Luck, Tony wrote:
> >>>> Alternatively, can user space just take a "load all resctrl modules
> >>>> and see what sticks" (even modules of different architectures since
> >>>> a user space may want to be generic) approach?
> >>>
> >>> This mostly works. Except for the cases where different modules access
> >>> the same underlying hardware, so can't be loaded together.
> >>>
> >>> Examples:
> >>>
> >>> rdt_l3_cat vs. rdt_l3_cdp - user needs to decide whether they want CDP or not.
> >>> But this is already true ... they have to decide whether to pass the "-o cdp" option
> >>> to mount.
> >>>
> >>> rdt_l3_mba vs. rdt_l3_mba_MBps - does the user want to control memory bandwidth
> >>> with percentages, or with MB/sec values. Again the user already has to make this
> >>> decision when choosing mount options.
> >>>
> >>>
> >>> Maybe the "What resctrl options does this machine support?" question would be
> >>> best answered with a small utility?
> >>
> >> A user space utility or a kernel provided utility? If it is a user space utility
> >> I think it would end up needing to duplicate what the kernel is required to do
> >> to know if a particular feature is supported. It seems appropriate that this
> >> could be a kernel utility that can share this existing information with user
> >> space. resctrl already supports the interface for this via /sys/fs/resctrl/info.
> >
> > I was imagining a user space utility. Even though /proc/cpuinfo doesn't show
> > all features, a utility has access to all the CPUID leaves that contain the
> > details of each feature enumeration.
>
> For x86 that may work (in some scenarios, see later) for now but as I understand
> Arm would need a different solution where I believe the information is obtained
> via ACPI. I think it is unnecessary to require user space to have parsers for
> CPUID and ACPI if that same information needs to be parsed by the kernel and
> there already exists an interface with which the information is communicated
> from kernel to user space. Also, just because CPUID shows that a feature
> is supported by the hardware does not mean that the kernel has support for that
> feature. This could be because of a feature mismatch between user space and
> kernel, or even some features disabled for use via the, for example "rdt=!l3cat",
> kernel parameter.

Agreed this is complex, and my initial resctrl2 proposal lacks
functionality in this area.

> >> fyi ... as with previous attempts to discuss this work I find it difficult
> >> to discuss this work when you are selective about what you want to discuss/answer
> >> and just wipe the rest. Through this I understand that I am not your target
> >> audience.
> >
> > Not my intent. I value your input highly. I'm maybe too avid a follower of the
> > "trim your replies" school of e-mail etiquette. I thought I'd covered the gist
> > of your message.
> >
> > I'll try to be more thorough in responding in the future.
>
> Two items from my previous email remain open:
>
> First, why does making the code modular require everything to be loadable
> modules?
> I think that it is great that the code is modular. Ideally it will help to
> support the other architectures. As you explain this modular design also
> has the benefit that "modules" can be loaded and unloaded after resctrl mount.
> Considering your example of MBA and MBA_MBps support ... if I understand
> correctly with code being modular it enables changes from one to the other
> after resctrl mount. User can start with MBA and then switch to MBA_MBps
> without needing to unmount resctrl. What I do not understand is why does
> the code being modular require everything to be modules? Why, for example,
> could a user not interact with a resctrl file that enables the user to make
> this switch from, for example, MBA to MBA_MBps? With this the existing
> interfaces can remain to be respected, the existing mount parameters need
> to remain anyway, while enabling future "more modular" usages.

Lots of advantages to modules:
1) Only load what you need.
   - saves memory
   - reduces potential attack surface
   - may avoid periodic timers (e.g. for MBM overflow and
     for LLC occupancy "limbo" mode).
2) If there is a security fix, it can be deployed without a reboot.
3) Isolation between different features.
   - Makes development and testing simpler

Sure, some things like switching MBA to MBA_MBps mode by writing to
a control file are theoretically possible. But they would be far more
complex implementations with many opportunities for bugs.
I think Vikas made a good choice to make this a mount option rather
than selectable at run time.
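
To make the "decide at mount time" point concrete, here is a minimal
sketch of how the choice is expressed with the existing resctrl today.
It assumes the current "cdp" and "mba_MBps" mount options and the usual
/sys/fs/resctrl mount point. The helper name is made up for illustration,
and it only builds the command line, it does not run it.

```python
def resctrl_mount_argv(cdp=False, mba_mbps=False, mountpoint="/sys/fs/resctrl"):
    """Build (but do not run) a mount command for the existing resctrl fs.

    Hypothetical helper: the option names "cdp" and "mba_MBps" are the
    mount options of the current resctrl; everything else is illustrative.
    """
    opts = []
    if cdp:
        opts.append("cdp")        # L3 code/data prioritization
    if mba_mbps:
        opts.append("mba_MBps")   # MBA controlled in MB/sec instead of percent
    argv = ["mount", "-t", "resctrl"]
    if opts:
        argv += ["-o", ",".join(opts)]
    argv += ["resctrl", mountpoint]
    return argv


if __name__ == "__main__":
    # Show the command a user would run to get MB/sec bandwidth control.
    print(" ".join(resctrl_mount_argv(mba_mbps=True)))
```

Changing either choice means unmounting and mounting again with a
different option string, which is the behavior the mount option was
designed around.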

> Second, copied from my previous email, what is the plan to deal with current
> users that just mount resctrl and expect to learn from it what features are
> supported?

Do such users exist? Resctrl is a sophisticated system management tool.
I'd expect system administrators deploying it are well aware of the
capabilities of the different types of systems in their data center.

But if I'm wrong, then I have to go back to figure out a way to
expose this information in a better way than randomly running "modprobe"
to see what sticks.
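
For reference, the "mount and look" usage in question amounts to something
like the sketch below. It assumes the existing layout where each supported
resource shows up as a subdirectory of /sys/fs/resctrl/info. The function
name is made up for illustration, and this is not taken from any real tool.

```python
import os


def resctrl_info_resources(info_root="/sys/fs/resctrl/info"):
    """Return the sorted names of the resource directories under info_root.

    Hypothetical helper. Plain files in the info directory are skipped.
    Returns an empty list if the directory does not exist, e.g. because
    resctrl is not mounted.
    """
    try:
        with os.scandir(info_root) as entries:
            return sorted(e.name for e in entries if e.is_dir())
    except FileNotFoundError:
        return []


if __name__ == "__main__":
    for name in resctrl_info_resources():
        print(name)
```

Anything the kernel does not support, or that was disabled on the command
line, simply does not appear in the list, which is what such users would
be relying on.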

-Tony