Re: [EXT] Re: [RFC 00/12] ARM: MPAM: add support for priority partitioning control

From: Reinette Chatre
Date: Thu Aug 24 2023 - 14:01:35 EST


Hi Amit,

On 8/24/2023 1:52 AM, Amit Singh Tomar wrote:
> Hi Reinette,
>
> Thanks for your prompt response.
>
> -----Original Message-----
> From: Reinette Chatre <reinette.chatre@xxxxxxxxx>
> Sent: Thursday, August 24, 2023 3:50 AM
> To: Amit Singh Tomar <amitsinght@xxxxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> Cc: fenghua.yu@xxxxxxxxx; james.morse@xxxxxxx; George Cherian <gcherian@xxxxxxxxxxx>; robh@xxxxxxxxxx; peternewman@xxxxxxxxxx; Luck, Tony <tony.luck@xxxxxxxxx>
> Subject: Re: [EXT] Re: [RFC 00/12] ARM: MPAM: add support for priority partitioning control
>
> Hi Amit,
>
> On 8/23/2023 2:33 PM, Amit Singh Tomar wrote:
>> Hi Reinette,
>>
>> (Kindly follow the responses in a top-to-bottom sequence).
>>
>> -----Original Message-----
>> From: Reinette Chatre <reinette.chatre@xxxxxxxxx>
>> Sent: Thursday, August 24, 2023 12:37 AM
>> To: Amit Singh Tomar <amitsinght@xxxxxxxxxxx>;
>> linux-kernel@xxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
>> Cc: fenghua.yu@xxxxxxxxx; james.morse@xxxxxxx; George Cherian
>> <gcherian@xxxxxxxxxxx>; robh@xxxxxxxxxx; peternewman@xxxxxxxxxx; Luck,
>> Tony <tony.luck@xxxxxxxxx>
>> Subject: Re: [EXT] Re: [RFC 00/12] ARM: MPAM: add support for priority
>> partitioning control
>>
>> Hi Amit,
>>
>> On 8/22/2023 5:44 AM, Amit Singh Tomar wrote:
>>> Hi Reinette,
>>>
>>> Thanks for having a look!
>>>
>>> -----Original Message-----
>>> From: Reinette Chatre <reinette.chatre@xxxxxxxxx>
>>> Sent: Friday, August 18, 2023 12:41 AM
>>> To: Amit Singh Tomar <amitsinght@xxxxxxxxxxx>;
>>> linux-kernel@xxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
>>> Cc: fenghua.yu@xxxxxxxxx; james.morse@xxxxxxx; George Cherian
>>> <gcherian@xxxxxxxxxxx>; robh@xxxxxxxxxx; peternewman@xxxxxxxxxx;
>>> Luck, Tony <tony.luck@xxxxxxxxx>
>>> Subject: [EXT] Re: [RFC 00/12] ARM: MPAM: add support for priority
>>> partitioning control
>>>
>>> (+Tony)
>>>
>>> Hi Amit,
>>>
>>> On 8/15/2023 8:27 AM, Amit Singh Tomar wrote:
>>>> Arm Memory System Resource Partitioning and Monitoring (MPAM)
>>>> supports different controls that can be applied to different
>>>> resources in the system. For instance, an optional priority
>>>> partitioning control where a priority value generated by one MSC
>>>> propagates over the interconnect to another MSC (known as downstream
>>>> priority), or is applied within an MSC for internal operations.
>>>>
>>>> Marvell's implementation of ARM MPAM supports a priority partitioning
>>>> control that allows the LLC MSC to generate priority values that get
>>>> propagated (along with read/write requests from upstream) to the DDR block.
>>>> Within the DDR block the priority values are mapped to different traffic classes under the DDR QoS strategy.
>>>> The link[1] gives some idea about the DDR QoS strategy, and terms like
>>>> LPR, VPR and HPR.
>>>>
>>>> Setup priority partitioning control under Resource control
>>>> ----------------------------------------------------------
>>>> At present, resource control (resctrl) provides a basic interface to
>>>> configure CAT (Cache Allocation Technology) and MBA (Memory Bandwidth Allocation) capabilities.
>>>> ARM MPAM uses it to support controls like Cache portion partitioning
>>>> (CPOR) and MPAM bandwidth partitioning.
>>>>
>>>> As an example, the "schemata" file under a resource control group contains
>>>> information about cache portion bitmaps and memory bandwidth
>>>> allocation, and these are used to configure the Cache portion partitioning (CPOR) and MPAM bandwidth partitioning controls.
>>>>
>>>> MB:0=0100
>>>> L3:0=ffff
>>>>
>>>> But resctrl doesn't provide a way to set up the other controls that ARM
>>>> MPAM provides (for instance, the Priority partitioning control
>>>> mentioned above). To support this, James has suggested using the
>>>> already existing schemata file to stay compatible with portable software.
>>>> The main idea behind this RFC is to start a discussion on how resctrl can be extended to support priority partitioning control.
>>>>
>>>> To support Priority partitioning control, the "schemata" file is updated
>>>> to accommodate a priority field (upon detection of the priority partitioning
>>>> capability), separated from the CPBM using the delimiter ",".
>>>>
>>>> L3:0=ffff,f where f indicates the downstream priority max value.
>>>>
>>>> This dspri value gets programmed per partition, and can be used to
>>>> override the QoS value coming from upstream (CPU).
>>>>
>>>> RFC patch-set[2] is based on James Morse's MPAM snapshot[3] for 6.2,
>>>> and ACPI table is based on DEN0065A_MPAM_ACPI_2.0.
>>>>
>>>
>>> There are some aspects of this that I think we should be cautious
>>> about. First, there may inevitably be more properties in the future
>>> that need to be associated with a resource allocation, these may
>>> indeed be different between architectures and individual platforms.
>>> Second, user space needs a way to know which properties are supported
>>> and what the valid parameters may be.
>>>
>>> On a high level I thus understand the goal to be to add support for
>>> assigning a property to a resource allocation with "Priority
>>> partitioning control" being the first property.
>>
>>> To that end, I have a few questions:
>>> * How can this interface be expanded to support more properties with the
>>> expectation that a system/architecture may not support all resctrl supported
>>> properties?
>>> [>>] All these new controls ("Priority partitioning" is one of them) are detected as resource capabilities (via the Features Identification Register), and a control will not be probed if the system/architecture
>>> doesn't support it. From the resource control side, this means that users will never see unsupported controls in the schemata file. For instance, on a platform that does not support Priority partitioning control, the
>>> schemata file looks like:
>>>
>>> # cat schemata
>>> L3:1=ffff
>>>
>>> As opposed to when the system has Priority partitioning control:
>>> # cat schemata
>>> L3:1=ffff,f
>>>
>>
>> Right, but my question is "How can this interface be expanded ...".
>> Consider a future L3 resource that has a new and different property
>> ("new_property") that is independent from "Priority partitioning".
>> If "L3:1=ffff,f" means "Priority partitioning" == 0xf, how can a value be assigned to "new_property" if the system's L3 supports it but not "Priority partitioning"?
>> If I understand correctly the proposed interface is a positional interface and "Priority partitioning" is always in second field ...
>>
>> [>>] Yes, "Priority partitioning" will always be the second field.
>>
>> but a system may or may not support this property so does it require an empty second field to be able to use other properties?
>>
>> [>>] Yes, in the absence of this control ("Priority partitioning"), the second field will be taken by another control (if supported).
>>
>> So, for example, if the L3 resource is equipped with two controls, i.e. CPOR and PPART, the schemata will look like:
>>
>> L3:0=XXXX,PPART=X
>>
>> and, if the same resource is equipped with another set of controls, i.e. CPOR and CCAP (cache capacity partitioning), the schemata will look like:
>>
>> L3:0=XXXX,CCAP=X
>>
>> and, in case the resource is equipped with all three controls, the schemata will look like:
>>
>> L3:0=XXXX,PPART=X,CCAP=X
>>
>> Each of these combinations features its own format specifier.
>>
>
> I see. I do have a similar concern as Peter regarding the impact of this change on parsing of the schemata file. I peeked at intel-cmt-cat's implementation [1] and if I understand it correctly these changes will break it. This is just one example but I do think this will have significant impact on user space that should be avoided.
>
> [>>] To be honest, I don't see how it breaks things on the x86 side. None of these new controls (PPART or CCAP) exist on Intel platforms, and in the absence of these controls the schemata file remains the same, i.e.
> L3:0=ffff
>
> Or are you talking about the situation where Intel may have a similar control, and this proposed approach would then break intel-cmt-cat?

There are indeed two parts to this. First, I still consider
this as breaking user space because user space interacts with
"resctrl" that should be a generic interface. Second, yes, any
"resctrl" interface is available to every vendor. It is not expected
that all systems support all features but resctrl is the interface
with which user space can query what features are supported in
order to interact with the features.
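
To make the first point concrete, here is a minimal sketch of the kind
of parser user space may already contain. It is hypothetical:
parse_schemata_line() is my own illustration, not code from
intel-cmt-cat, and it assumes a cache line of the current form where
every domain value is a single hexadecimal bitmask.

```python
def parse_schemata_line(line):
    """Parse a cache line such as 'L3:0=ffff;1=00ff'.

    Hypothetical helper: assumes the current format, where each
    ';'-separated domain entry is '<id>=<hex bitmask>'.
    """
    resource, _, domains = line.strip().partition(":")
    masks = {}
    for entry in domains.split(";"):
        dom_id, _, value = entry.partition("=")
        # A value such as 'ffff,f' is not a hex number, so this raises
        # ValueError for the proposed comma-separated format.
        masks[int(dom_id)] = int(value, 16)
    return resource, masks
```

A parser like this accepts "L3:0=ffff" but fails on "L3:0=ffff,f",
whichever vendor's hardware introduced the extra field.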

> Apart from this, this discussion has focused on the display of properties when the user views the schemata file. We also need to consider how the user will provide new data by writing to the schemata file.
> For example, I do not think it is convenient for the user to have to provide the allocation bitmask every time the "Priority partitioning" value needs to be changed for a resource instance.
>
> [>>] This is something I was pondering: not having to provide the allocation bitmask while changing "Priority partitioning" values, or vice versa. But the ARM MPAM device driver
> runs through all the resource instances (learned from the ACPI table) and programs the ris_idx (along with the partid) into MPAMCFG_PART_SEL_NS[RIS].
> After that, it programs the portion bitmap (related to CPOR) or the priority value (depending on the ris_idx) into MPAMCFG_CPBM_NS or MPAMCFG_PRI_NS[dspri].
>
> As an example, for resource index 0 (MPAMCFG_PART_SEL_NS[0]) it programs the priority value, and for resource index 1 (MPAMCFG_PART_SEL_NS[1]) it programs the portion bitmap value. In a way, the driver[1]
> expects both these values to be supplied. Maybe James can correct me here.

I see obtaining the data from user space as separate from
writing the data to the hardware. resctrl maintains the
hardware configuration internally so it is possible to
have user space modify a portion of the configuration
while still being able to write the entire configuration
to hardware if that is required.
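
As a rough sketch of that separation (this is not how resctrl is
implemented; DomainConfig, ConfigCache and the write_hw callback are
names I made up for illustration), a cached configuration lets the
user change one property while the full configuration is still what
gets written out:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DomainConfig:
    cbm: int    # cache portion bitmap
    dspri: int  # downstream priority


class ConfigCache:
    """Hypothetical cache of per-domain configuration."""

    def __init__(self, write_hw):
        # write_hw(dom_id, cfg) stands in for the hardware write and
        # always receives the complete configuration.
        self._write_hw = write_hw
        self._domains = {}

    def set_initial(self, dom_id, cfg):
        self._domains[dom_id] = cfg

    def update(self, dom_id, **changes):
        # Merge only the fields the user supplied into the cached
        # state, then hand the whole configuration to the writer.
        cfg = replace(self._domains[dom_id], **changes)
        self._domains[dom_id] = cfg
        self._write_hw(dom_id, cfg)
        return cfg
```

With this shape, update(0, dspri=0xf) changes only the priority and
the bitmap written to hardware is the one already cached.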


> This may also be solved when considering Peter's idea but since this work depends on other work that is not upstream it is difficult to envision the impact of any suggestions.
>
> [>>] Initially, we thought about these three approaches:
>
> 1) Populate the resource control filesystem[2] with a new file that corresponds to the new control. It requires the priority value to be encoded around portion bitmaps, and James has suggested we should go via
> the "schemata" file approach.
>
> I think this is something Tony has pointed out in the other thread.

Synchronizing writes to hardware with updates to separate
files may be a challenge.

>
> 2) The second approach that we discussed internally is to have the schemata for CPOR and PPART separated by a new line, as suggested by Peter, but it may require tweaking
> the ARM MPAM device driver a bit. It was kind of a toss-up between the 2nd and 3rd approaches :), and we went with the 3rd one.
>
> L3:0=XXXX
> L3:0=PPART=X
>
> Will look into it again.

Tony has suggestions here. I think it would be a good exercise to
write a user space client to explore how the interface
can be made most convenient.
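
A starting point for such a client could be as small as the sketch
below. Everything in it is an assumption for the sake of exploration:
format_update() and write_schemata() are made-up helpers, and the
"PPART" keyword and line layout simply mirror the example quoted
above, they are not an existing resctrl interface.

```python
def format_update(resource, dom_id, value, prop=None):
    """Build one schemata line for a single domain.

    prop=None yields the existing bitmask form ('L3:0=ffff').
    A named prop yields the layout from the quoted example
    ('L3:0=PPART=f'), which is only a proposal.
    """
    rhs = f"{value:x}" if prop is None else f"{prop}={value:x}"
    return f"{resource}:{dom_id}={rhs}\n"


def write_schemata(path, line):
    # path would be the schemata file of a resource group; it is
    # passed in so the client can be exercised against a plain file.
    with open(path, "w") as f:
        f.write(line)
```

Writing even this much shows what the user has to supply for each
kind of change, which is the convenience question raised earlier.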

Reinette