Re: [PATCH v3 1/3] perf/x86/amd/lbr: Use freeze based on availability

From: Ingo Molnar
Date: Wed Mar 13 2024 - 06:15:32 EST



* Sandipan Das <sandipan.das@xxxxxxx> wrote:

> Currently, it is assumed that LBR Freeze is supported on all processors
> which have CPUID leaf 0x80000022[EAX] bit 1 set. This is incorrect as

That's X86_FEATURE_AMD_LBR_V2, right? Should probably be mentioned in the
changelog.

> the feature availability is additionally dependent on CPUID leaf
> 0x80000022[EAX] bit 2 being set which may not be set for all Zen 4
> processors. Define a new feature bit for LBR and PMC freeze and set the
> freeze enable bit (FLBRI) in DebugCtl (MSR 0x1d9) conditionally.

What happens on such Zen 4 CPUs that don't support LBR Freeze? Does the CPU
just ignore it, or something worse?
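
To make sure we are talking about the same thing, here is my reading of
the intended logic as a standalone sketch, not the actual patch. The macro
and helper names are made up for illustration, and FLBRI being bit 11 of
DebugCtl is an assumption on my part (it matches the existing
DEBUGCTLMSR_FREEZE_LBRS_ON_PMI definition, if I remember it right):

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 0x80000022 EAX bits, as described in the changelog. */
#define EXT_PERFMON_LBR_V2		(UINT32_C(1) << 1)
#define EXT_PERFMON_LBR_PMC_FREEZE	(UINT32_C(1) << 2)

/* Assumption: the FLBRI freeze enable is bit 11 of DebugCtl (MSR 0x1d9). */
#define DEBUGCTL_FLBRI			(UINT64_C(1) << 11)

/* Hypothetical helper: is LBR v2 advertised in EAX? */
static bool has_lbr_v2(uint32_t eax)
{
	return (eax & EXT_PERFMON_LBR_V2) != 0;
}

/* Hypothetical helper: is LBR and PMC freeze advertised in EAX? */
static bool has_lbr_pmc_freeze(uint32_t eax)
{
	return (eax & EXT_PERFMON_LBR_PMC_FREEZE) != 0;
}

/*
 * Return the DebugCtl value to program: the freeze bit is set only
 * when both LBR v2 and the freeze capability are advertised, and
 * cleared otherwise. All other bits are left untouched.
 */
static uint64_t debugctl_for_lbr(uint64_t debugctl, uint32_t eax)
{
	if (has_lbr_v2(eax) && has_lbr_pmc_freeze(eax))
		debugctl |= DEBUGCTL_FLBRI;
	else
		debugctl &= ~DEBUGCTL_FLBRI;

	return debugctl;
}
```

With EAX bit 1 set but bit 2 clear, this never writes the freeze bit, which
is the case my question above is about.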

> It should still be possible to use LBR without freeze for profile-guided
> optimization of user programs by using a user-only branch filter during
> profiling. When the user-only filter is enabled, branches are no longer
> recorded after the transition to CPL 0 upon PMI arrival. When branch
> entries are read in the PMI handler, the branch stack does not change.
>
> E.g.
>
> $ perf record -j any,u -e ex_ret_brn_tkn ./workload
>
> Since the feature bit is visible under flags in /proc/cpuinfo, it can be
> used to determine the feasibility of use-cases which require LBR Freeze
> to be supported by the hardware such as profile-guided optimization of
> kernels.

Sounds good to me.
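
For tooling that wants to test for this from userspace, a whole-token
match on the flags line should be enough. A rough sketch: the helper name
is hypothetical, and the flag string "amd_lbr_pmc_freeze" is only my
assumption of what the lowercase name will end up being:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return true if 'name' appears as a complete,
 * whitespace-delimited token in a /proc/cpuinfo "flags" line.
 * Substring hits such as "amd_lbr_v2x" do not count as a match.
 */
static bool cpu_flags_has(const char *flags, const char *name)
{
	size_t len = strlen(name);
	const char *p = flags;

	if (len == 0)
		return false;

	while ((p = strstr(p, name)) != NULL) {
		bool starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

		if (starts && ends)
			return true;

		/* Skip past this partial hit and keep searching. */
		p += len;
	}

	return false;
}
```

The caller would read the "flags" line of one CPU from /proc/cpuinfo and
pass it in unchanged.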

> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 4af140cf5719..e47ea31b019d 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -97,7 +97,7 @@
> #define X86_FEATURE_SYSENTER32 ( 3*32+15) /* "" sysenter in IA32 userspace */
> #define X86_FEATURE_REP_GOOD ( 3*32+16) /* REP microcode works well */
> #define X86_FEATURE_AMD_LBR_V2 ( 3*32+17) /* AMD Last Branch Record Extension Version 2 */
> -/* FREE, was #define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) "" LFENCE synchronizes RDTSC */
> +#define X86_FEATURE_AMD_LBR_PMC_FREEZE ( 3*32+18) /* AMD LBR and PMC Freeze */
> #define X86_FEATURE_ACC_POWER ( 3*32+19) /* AMD Accumulated Power Mechanism */
> #define X86_FEATURE_NOPL ( 3*32+20) /* The NOPL (0F 1F) instructions */
> #define X86_FEATURE_ALWAYS ( 3*32+21) /* "" Always-present feature */

Could you please port this to the latest upstream kernel? The 3*32+18 slot
is now used for another purpose, and we need to define a new synthetic
CPUID word, word 21 if I'm counting it right.

Don't forget to increase NCAPINTS from 21 to 22, and consider the fixed
asserts in the x86_bug_flags[] definitions in <asm/cpufeature.h>, and the
asserts in <asm/disabled-features.h> and <asm/required-features.h>. This
new word should probably be added in a separate preparatory patch.
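
Roughly what I have in mind for the numbering, as a standalone sketch and
not the real headers. Putting the new flag at bit 0 of word 21 is only an
assumption, and the helpers are hypothetical:

```c
#include <assert.h>

/* One more 32-bit capability word than before: 21 -> 22. */
#define NCAPINTS			22

/* Assumed placement: first bit of the new synthetic word 21. */
#define X86_FEATURE_AMD_LBR_PMC_FREEZE	(21*32 + 0)

/* Hypothetical helpers splitting a feature number into word and bit. */
static inline unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

static inline unsigned int feature_bit(unsigned int feature)
{
	return feature % 32;
}

/*
 * Build-time check in the spirit of the existing fixed asserts: the
 * build breaks if a feature lands in a word that NCAPINTS does not
 * cover.
 */
static_assert(X86_FEATURE_AMD_LBR_PMC_FREEZE / 32 < NCAPINTS,
	      "feature word out of range, NCAPINTS needs to be bumped");
```

With NCAPINTS left at 21 the static_assert above would fire, which is the
kind of breakage the existing asserts are there to catch.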

Thanks,

Ingo