Re: [PATCH 1/3] x86/resctrl: Display the number of available CLOSIDs

From: Haifeng Xu
Date: Thu Jan 25 2024 - 02:44:16 EST




On 2024/1/25 06:23, Reinette Chatre wrote:
> Hi Haifeng,
>
> Thank you for sharing your requirements as well as submitting
> changes to support them.
>
> I would like to start with a high level overview that applies
> to all three patches you submitted. This relates to customs,
> formatting, and style that will help your contributions get
> into the kernel.
>
> In your next submission, please do submit your patches together
> as a series with a cover letter. This means that your series
> starts with a cover letter (think of it as "patch #0") and the
> patches are sent in reply to that cover letter. This is described
> in more detail in "Documentation/process/5.Posting.rst".
>
> Regarding the patches themselves. Please read and follow
> Documentation/process/coding-style.rst and
> Documentation/process/maintainer-tip.rst regarding customs.
> The former is a general document that applies to the whole kernel
> while the latter contains more specific customs related to the
> area you are are contributing to here.
>
> As a final comment, please ensure that your patches
> pass a "scripts/checkpatch.pl --strict" check. There are more
> details about this in "Documentation/process/5.Posting.rst".
>
> While the documents mentioned above should get you started there
> is a lot of other valuable information in "Documentation/process".
> Consider the index in that directory to help you navigate through
> the available topics.

Thanks for your suggestions.

>
> On 1/23/2024 1:20 AM, Haifeng Xu wrote:
>> We can know the number of CLOSIDs for each rdt resource, for example:
>>
>> cat /sys/fs/resctrl/info/L3/num_closids
>> cat /sys/fs/resctrl/info/MB/num_closids
>> ...
>>
>> The number of available CLOSIDs is the minimal value of them. When users
>> try to create new control groups, to avoid running out of CLOSIDs, they
>> have to traverse /sys/fs/resctrl/ and count the number of directories.
>>
>> To make things more easier, add a RFTYPE_TOP_INFO file 'free_closids'
>> that tells users how many free closids are left.
>
> I do not see this as a change that benefits the kernel or user space.
> It sounds to me as though user space is planning some behavior based
> on what it knows about the current kernel internals and requesting
> more information to peek into these internals to make it easier
> to do so. The kernel can always choose to do things different
> internally, but it is required to maintain a consistent interface to
> user space. We should thus always take great care with new interfaces.
>
> From what I can tell user space intends to use this "free_closids"
> to mean "how many more control resource groups can be created". This
> is not a contract that I think we should enter into. There has been
> discussions aiming to disconnect the number of resource groups
> from the number of closids (effectively letting resource groups
> with the same resource allocations share a closid).

Is this feature merged into latest kernel or just discussions?
Could you please provide more details?
Last time, you mentioned that a monitoring group can be moved
from one control group to another.

This is something
> that the kernel may still do at some point but sharing "free_closids"
> knowing that user space intends to use it as a "number of resource groups
> remaining" counter would make future enhancements like this difficult.

OK, thanks.

>
> Could you please provide more detail in why this is required? User
> space should not need to keep track to know how many groups can be
> created, creating a new group will fail with ENOSPC if no more
> groups can be created.
>

User space reports alerts when failing to create new groups. If no one tell them that
closids aren't enough, they will keep trying to create new groups and the number of alerts
could be very high.

So we want to know how many closids are available, if it's zero, we give up creating
new control groups and those alerts will disappear.

Maybe user behavoirs can be ajusted. There is no need to create too many groups, especially
for those groups with same resource. Or as you mentioned above, we can reuse closid.


> Reinette