Re: [PATCH v5 01/12] x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag

From: Reinette Chatre
Date: Thu Sep 29 2022 - 17:58:39 EST


Hi Babu,

On 9/27/2022 1:25 PM, Babu Moger wrote:
> Add the new AMD feature X86_FEATURE_SMBA. With this feature, the QOS
> enforcement policies can be applied to external slow memory connected
> to the host. QOS enforcement is accomplished by assigning a Class Of
> Service (COS) to a processor and specifying allocations or limits for
> that COS for each resource to be allocated.
>
> This feature is identified by the CPUID Function 8000_0020_EBX_x0.
>
> CPUID Fn8000_0020_EBX_x0 AMD Bandwidth Enforcement Feature Identifiers
> (ECX=0)
>
> Bits Field Name Description
> 2 L3SBE L3 external slow memory bandwidth enforcement
>
>
> Currently, CXL.memory is the only supported "slow" memory device. With
> the support of SMBA feature, the hardware enables bandwidth allocation
> on the slow memory devices. If there are multiple slow memory devices
> in the system, then the throttling logic groups all the slow sources
> together and applies the limit on them as a whole.
>
> The presence of the SMBA feature(with CXL.memory) is independent of
> whether slow memory device is actually present in the system. If there
> is no slow memory in the system, then setting a SMBA limit will have no
> impact on the performance of the system.
>
> Presence of CXL memory can be identified by numactl command.
>
> $numactl -H
> available: 2 nodes (0-1)
> node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
> node 0 size: 63678 MB node 0 free: 59542 MB
> node 1 cpus:
> node 1 size: 16122 MB
> node 1 free: 15627 MB
> node distances:
> node 0 1
> 0: 10 50
> 1: 50 10
>
> CPU list for CXL memory will be empty. The cpu-cxl node distance is
> greater than cpu-to-cpu distances. Node 1 has the CXL memory in this
> case. CXL memory can also be identified using ACPI SRAT table and
> memory maps.
>
> Feature description is available in the specification, "AMD64
> Technology Platform Quality of Service Extensions, Revision: 1.03
> Publication # 56375 Revision: 1.03 Issue Date: February 2022".
>
> Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/kernel/cpu/scattered.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index ef4775c6db01..349852b9daa4 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -304,6 +304,7 @@
> #define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */
> #define X86_FEATURE_USE_IBPB_FW (11*32+16) /* "" Use IBPB during runtime firmware calls */
> #define X86_FEATURE_RSB_VMEXIT_LITE (11*32+17) /* "" Fill RSB on VM exit when EIBRS is enabled */
> +#define X86_FEATURE_SMBA (11*32+18) /* Slow Memory Bandwidth Allocation */
>
> /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
> #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
> diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
> index fd44b54c90d5..885ecf46abb2 100644
> --- a/arch/x86/kernel/cpu/scattered.c
> +++ b/arch/x86/kernel/cpu/scattered.c
> @@ -44,6 +44,7 @@ static const struct cpuid_bit cpuid_bits[] = {
> { X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
> { X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
> { X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
> + { X86_FEATURE_SMBA, CPUID_EBX, 2, 0x80000020, 0 },
> { X86_FEATURE_PERFMON_V2, CPUID_EAX, 0, 0x80000022, 0 },
> { 0, 0, 0, 0, 0 }
> };
>
>

Please respect the coding style of the area you are modifying.
This is the same feedback as provided in v4 in
https://lore.kernel.org/lkml/ba36c68c-0b13-e8a2-fb45-8b84ea9f7259@xxxxxxxxx/

Looking ahead the same issue also remains in patch 3 as previously
mentioned in v4 feedback.
https://lore.kernel.org/lkml/c4a9ea23-4280-d54c-263b-354ea321f746@xxxxxxxxx/

Also missing is highlighting that configuration has changed from
per-domain to per-CPU and why.

It does not seem as though this series is ready. I will wait
for next version to have existing review comments addressed before
trying to look at new changes.

Reinette