Re: [PATCH v5 12/12] Documentation/x86: Update resctrl_ui.rst for new features

From: Reinette Chatre
Date: Tue Oct 04 2022 - 12:15:44 EST


Hi Babu,

On 10/4/2022 7:00 AM, Moger, Babu wrote:
> On 10/3/22 10:36, Reinette Chatre wrote:
>> On 10/3/2022 7:28 AM, Moger, Babu wrote:
>>> On 9/29/22 17:10, Reinette Chatre wrote:
>>>> Hi Babu,
>>>>
>>>> In subject: resctrl_ui.rst -> resctrl.rst
>>>>
>>>> On 9/27/2022 1:27 PM, Babu Moger wrote:
>>>>> Update the documentation for the new features:
>>>>> 1. Slow Memory Bandwidth allocation (SMBA).
>>>>> With this feature, the QOS enforcement policies can be applied
>>>>> to the external slow memory connected to the host. QOS enforcement
>>>>> is accomplished by assigning a Class Of Service (COS) to a processor
>>>>> and specifying allocations or limits for that COS for each resource
>>>>> to be allocated.
>>>>>
>>>>> 2. Bandwidth Monitoring Event Configuration (BMEC).
>>>>> The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
>>>>> are set to count all the total and local reads/writes respectively.
>>>>> With the introduction of slow memory, the two counters are not
>>>>> enough to count all the different types are memory events. With the
>>>> types are memory events -> types of memory events?
>>> Ok Sure
>>>>> feature BMEC, the users have the option to configure mbm_total_bytes
>>>>> and mbm_local_bytes to count the specific type of events.
>>>>>
>>>>> Also add configuration instructions with examples.
>>>>>
>>>>> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
>>>>> ---
>>>> ...
>>>>
>>>>> +
>>>>> +"mbm_total_config", "mbm_local_config":
>>>>> + These files contain the current event configuration for the events
>>>>> + mbm_total_bytes and mbm_local_bytes, respectively, when the
>>>>> + Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
>>>>> + The event configuration settings are domain specific. Changing the
>>>>> + configuration on one CPU in a domain would affect the whole domain.
>>>> This contradicts the implementation done in this series where the
>>>> configuration is changed on every CPU in the domain.
>>> How about this?
>>>
>>> The event configuration settings are domain specific and will affect all the CPUs in the domain.
>> There remains a disconnect between this and the implementation that writes the
>> configuration to every CPU.
>>
>> You could make this change to the documentation but then the
>> implementation needs more than "Update MSR_IA32_EVT_CFG_BASE MSR on all
>> the CPUs in cpu_mask" - that comment needs to highlight that the
>> implementation does not follow the architecture and scope rules nor how
>> configuration changes are made in the rest of the driver and why. Previously [1]
>> you indicated that this is based on guidance from hardware team so perhaps you
>> could document it as a hardware quirk related to this feature? At the minimum
>> it should acknowledge the disconnect.
>
> ok. I could document this in the code patch 9 ([PATCH v5 09/12]
> x86/resctrl: Add sysfs interface to write mbm_total_bytes event configuration).
> Something like this.
>
> /*
> + * Update MSR_IA32_EVT_CFG_BASE MSR on all the CPUs in cpu_mask.

Since multiple MSRs are impacted, how about:

"Update MSR_IA32_EVT_CFG_BASE MSRs ..."

> + * The MSR MSR_IA32_EVT_CFG_BASE is domain specific. Writing the

"The MSRs offset from MSR MSR_IA32_EVT_CFG_BASE are scoped at the domain
level. Writing any of these MSRs on one CPU is supposed to be observed
by all CPUs in the domain."

> + * MSR on one CPU will affect all the CPUs in the domain.

Since this is not the case, perhaps it should be " ...
is supposed to affect all the CPUs ..." instead?

> + * However, the hardware team recommends to update the MSR on
> + * all the CPU threads. It is not clear in the document yet.

To be consistent, could "CPU threads" be "CPUs"?

Could you please be specific about which document you refer to? That said,
I do not think that the last part about "the document" adds value here.
You are representing AMD with this submission, and the comment already
documents that you are following the guidance from the hardware team in
this regard. I think that is sufficient.


> + * Doc will be updated in the next revision.

This is a change that will be made to the kernel source ... what does
"next revision" mean when somebody reads this comment in a few years?

Putting all of the above together, how about:

"Update MSR_IA32_EVT_CFG_BASE MSRs on all the CPUs in cpu_mask. The MSRs
offset from MSR MSR_IA32_EVT_CFG_BASE are scoped at the domain level.
Writing any of these MSRs on one CPU is supposed to be observed by all
CPUs in the domain. However, the hardware team recommends updating these
MSRs on all the CPUs in the domain."
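
To make the behaviour that comment describes concrete, here is a minimal
userspace sketch, not the code from the series. It simulates per-CPU copies
of the event configuration MSRs and writes the new value on every CPU set in
cpu_mask instead of relying on one write being observed domain-wide. All
sketch_* names, the SKETCH_* constants (including the placeholder base
address), and the 64-bit mask are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values for illustration; not taken from the patch series. */
#define SKETCH_EVT_CFG_BASE	0x1000u
#define SKETCH_NR_CPUS		8u
#define SKETCH_NR_EVENTS	2u	/* one total and one local event */

/* Simulated per-CPU copies of the MSRs offset from the base. */
uint32_t sketch_msr[SKETCH_NR_CPUS][SKETCH_NR_EVENTS];

/* Hypothetical stand-in for writing one MSR on one specific CPU. */
void sketch_wrmsr_on_cpu(unsigned int cpu, uint32_t msr, uint32_t val)
{
	uint32_t index = msr - SKETCH_EVT_CFG_BASE;

	assert(cpu < SKETCH_NR_CPUS && index < SKETCH_NR_EVENTS);
	sketch_msr[cpu][index] = val;
}

/*
 * Write the event configuration on every CPU set in cpu_mask. Each event
 * has its own MSR at (base + index), which is why the comment refers to
 * "MSRs" in the plural.
 */
void sketch_update_evt_cfg(uint64_t cpu_mask, unsigned int index, uint32_t val)
{
	unsigned int cpu;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (cpu_mask & (1ULL << cpu))
			sketch_wrmsr_on_cpu(cpu, SKETCH_EVT_CFG_BASE + index, val);
	}
}

/* Read back the simulated value seen by one CPU. */
uint32_t sketch_read_evt_cfg(unsigned int cpu, unsigned int index)
{
	assert(cpu < SKETCH_NR_CPUS && index < SKETCH_NR_EVENTS);
	return sketch_msr[cpu][index];
}
```

With a domain covering CPUs 0-3 (mask 0x0f), calling
sketch_update_evt_cfg(0x0f, 1, 0x33) leaves every CPU in that domain reading
0x33 for event index 1, while CPUs outside the mask and the other event are
left untouched.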

Reinette