Re: [PATCH v8 03/13] x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag

From: Reinette Chatre
Date: Tue Nov 22 2022 - 19:09:48 EST


Hi Babu,

On 11/4/2022 1:00 PM, Babu Moger wrote:
> Newer AMD processors support the new feature Bandwidth Monitoring Event
> Configuration (BMEC).
>
> The feature support is identified via CPUID Fn8000_0020_EBX_x0 (ECX=0).
> Bits    Field Name    Description
> 3       EVT_CFG       Bandwidth Monitoring Event Configuration (BMEC)
>
> Currently, the bandwidth monitoring events mbm_total_bytes and
> mbm_local_bytes are set to count all the total and local reads/writes
> respectively. With the introduction of slow memory, the two counters
> are not enough to count all the different types of memory events. With
> the feature BMEC, the users have the option to configure
> mbm_total_bytes and mbm_local_bytes to count the specific type of
> events.
>
> Each BMEC event has a configuration MSR, QOS_EVT_CFG (0xc000_0400h +
> EventID) which contains one field for each bandwidth type that can be

Looking at the later patches, it seems the MSR address is not really
0xc000_0400h + EventID but rather "0xc000_0400h + index_based_on_EventID"?
This may be too much detail for this changelog, so perhaps the specifics
can be deferred and the changelog can simply refer to the "configuration MSR".
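
To make the distinction concrete, here is a minimal sketch of index-based
addressing. The event IDs (0x02 for mbm_total_bytes, 0x03 for
mbm_local_bytes) and the index mapping (total -> 0, local -> 1) are
assumptions about what the later patches do, and every name below is
hypothetical; only the base address comes from the changelog.

```c
#include <assert.h>
#include <stdint.h>

/* Base of the QOS_EVT_CFG MSR range, as given in the changelog. */
#define EVT_CFG_MSR_BASE	0xc0000400u

/* Assumed event IDs; they are not stated in this patch. */
#define EVT_ID_MBM_TOTAL	0x02u
#define EVT_ID_MBM_LOCAL	0x03u

/* Hypothetical mapping from event ID to MSR index; -1 if not configurable. */
static int evt_cfg_index(uint32_t evtid)
{
	switch (evtid) {
	case EVT_ID_MBM_TOTAL:
		return 0;
	case EVT_ID_MBM_LOCAL:
		return 1;
	default:
		return -1;
	}
}

/* Returns the configuration MSR address, or 0 for an unsupported event. */
static uint32_t evt_cfg_msr(uint32_t evtid)
{
	int idx = evt_cfg_index(evtid);

	return idx < 0 ? 0 : EVT_CFG_MSR_BASE + (uint32_t)idx;
}
```

Under these assumptions evt_cfg_msr(EVT_ID_MBM_TOTAL) yields 0xc0000400,
whereas a literal reading of "base + EventID" would give 0xc0000402.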

> used to configure the bandwidth event to track any combination of
> supported bandwidth types. The event will count requests from every
> bandwidth type bit that is set in the corresponding configuration
> register.
>
> Following are the types of events supported:
>
> ==== ========================================================
> Bits Description
> ==== ========================================================
> 6    Dirty Victims from the QOS domain to all types of memory
> 5    Reads to slow memory in the non-local NUMA domain
> 4    Reads to slow memory in the local NUMA domain
> 3    Non-temporal writes to non-local NUMA domain
> 2    Non-temporal writes to local NUMA domain
> 1    Reads to memory in the non-local NUMA domain
> 0    Reads to memory in the local NUMA domain
> ==== ========================================================
>
> By default, the mbm_total_bytes configuration is set to 0x7F to count
> all the event types and the mbm_local_bytes configuration is set to
> 0x15 to count all the local memory events.
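
For reference, the two default values can be derived from the table above.
This is only a sketch: the macro and function names are hypothetical and do
not come from the series; the bit positions and the expected values 0x7F
and 0x15 are taken from the changelog.

```c
#include <assert.h>
#include <stdint.h>

/* Bandwidth type bits from the changelog table (names are hypothetical). */
#define BW_READS_LOCAL		(1u << 0)
#define BW_READS_REMOTE		(1u << 1)
#define BW_NT_WRITES_LOCAL	(1u << 2)
#define BW_NT_WRITES_REMOTE	(1u << 3)
#define BW_READS_SLOW_LOCAL	(1u << 4)
#define BW_READS_SLOW_REMOTE	(1u << 5)
#define BW_DIRTY_VICTIMS_ALL	(1u << 6)

/* Default for mbm_total_bytes: every supported bandwidth type. */
static uint32_t default_total_cfg(void)
{
	return BW_READS_LOCAL | BW_READS_REMOTE |
	       BW_NT_WRITES_LOCAL | BW_NT_WRITES_REMOTE |
	       BW_READS_SLOW_LOCAL | BW_READS_SLOW_REMOTE |
	       BW_DIRTY_VICTIMS_ALL;
}

/* Default for mbm_local_bytes: only the local NUMA domain types. */
static uint32_t default_local_cfg(void)
{
	return BW_READS_LOCAL | BW_NT_WRITES_LOCAL | BW_READS_SLOW_LOCAL;
}
```

Bits 0 through 6 together give 0x7F, and bits 0, 2 and 4 give 0x15, which
matches the defaults described in the changelog.
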
>
> Feature description is available in the specification, "AMD64
> Technology Platform Quality of Service Extensions, Revision: 1.03
> Publication
>
> Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/kernel/cpu/cpuid-deps.c | 1 +
> arch/x86/kernel/cpu/scattered.c | 1 +
> 3 files changed, 3 insertions(+)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index d68b4c9c181d..6732ca0117be 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -306,6 +306,7 @@
> #define X86_FEATURE_RSB_VMEXIT_LITE (11*32+17) /* "" Fill RSB on VM exit when EIBRS is enabled */
> #define X86_FEATURE_CALL_DEPTH (11*32+18) /* "" Call depth tracking for RSB stuffing */
> #define X86_FEATURE_SMBA (11*32+19) /* Slow Memory Bandwidth Allocation */
> +#define X86_FEATURE_BMEC (11*32+20) /* AMD Bandwidth Monitoring Event Configuration (BMEC) */

Surely a nitpick, but it is strange that the two features introduced in this
series are described differently. Why does BMEC deserve the "AMD" prefix
but SMBA does not? I also do not think the "(BMEC)" is necessary since
it is already in X86_FEATURE_BMEC.

> /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
> #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
> diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
> index c881bcafba7d..4555f9596ccf 100644
> --- a/arch/x86/kernel/cpu/cpuid-deps.c
> +++ b/arch/x86/kernel/cpu/cpuid-deps.c
> @@ -68,6 +68,7 @@ static const struct cpuid_dep cpuid_deps[] = {
> { X86_FEATURE_CQM_OCCUP_LLC, X86_FEATURE_CQM_LLC },
> { X86_FEATURE_CQM_MBM_TOTAL, X86_FEATURE_CQM_LLC },
> { X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC },
> + { X86_FEATURE_BMEC, X86_FEATURE_CQM_LLC },
> { X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL },
> { X86_FEATURE_AVX512_FP16, X86_FEATURE_AVX512BW },
> { X86_FEATURE_ENQCMD, X86_FEATURE_XSAVES },
> diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
> index 5a5f17ed69a2..67c4d24e06ef 100644
> --- a/arch/x86/kernel/cpu/scattered.c
> +++ b/arch/x86/kernel/cpu/scattered.c
> @@ -45,6 +45,7 @@ static const struct cpuid_bit cpuid_bits[] = {
> { X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
> { X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
> { X86_FEATURE_SMBA, CPUID_EBX, 2, 0x80000020, 0 },
> + { X86_FEATURE_BMEC, CPUID_EBX, 3, 0x80000020, 0 },
> { X86_FEATURE_PERFMON_V2, CPUID_EAX, 0, 0x80000022, 0 },
> { X86_FEATURE_AMD_LBR_V2, CPUID_EAX, 1, 0x80000022, 0 },
> { 0, 0, 0, 0, 0 }
>
>
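
As an illustration of what the new scattered.c entry encodes (leaf
0x80000020, subleaf 0, EBX bit 3), here is a user-space sketch of the same
check. It is not kernel code: it assumes a GCC or Clang toolchain that
provides __get_cpuid_count() in <cpuid.h>, and the helper names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

#define BMEC_CPUID_LEAF	0x80000020u	/* CPUID Fn8000_0020, ECX=0 */
#define BMEC_EBX_BIT	3		/* EVT_CFG */

/* Decode the BMEC bit from a raw EBX value of the leaf above. */
static bool ebx_has_bmec(uint32_t ebx)
{
	return (ebx >> BMEC_EBX_BIT) & 1u;
}

/* Query the running CPU; reports false on non-x86 builds. */
static bool cpu_has_bmec(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* __get_cpuid_count() returns 0 when the leaf is not supported. */
	if (!__get_cpuid_count(BMEC_CPUID_LEAF, 0, &eax, &ebx, &ecx, &edx))
		return false;
	return ebx_has_bmec(ebx);
#else
	return false;
#endif
}
```

Keeping the bit decode separate from the CPUID query lets the decode be
checked on any machine, including ones without the feature.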

Reinette