Re: [PATCH] x86/perf/zhaoxin: Add stepping check for ZX-C

From: Borislav Petkov
Date: Sat Feb 04 2023 - 08:45:09 EST


On Thu, Feb 02, 2023 at 05:17:38PM +0800, silviazhao-oc wrote:
> The Nano processor may not fully support the RDPMC instruction: it works
> for reading the general-purpose PMC counters, but raises a #GP (general
> protection fault) when accessing the fixed PMC counters. Furthermore, the
> family/model information is the same for Nano and ZX-C processors, so the
> zhaoxin PMU driver is wrongly loaded on Nano, which causes the kernel to
> fail to boot.
>
> To solve this problem, check the stepping information to distinguish
> between Nano and ZX-C processors.
>
> Fixes: 3a4ac121c2ca ("x86/perf: Add hardware performance events support for Zhaoxin CPU")
> Reported-by: Arjan <8vvbbqzo567a@xxxxxxxxxxxxxxxxx>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=212389
> Reported-by: Kevin Brace <kevinbrace@xxxxxxx>
>
> Signed-off-by: silviazhao-oc <silviazhao-oc@xxxxxxxxxxx>

Please use your proper name in the Signed-off-by.

> ---
> arch/x86/events/zhaoxin/core.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c
> index 949d845c922b..cef1de251613 100644
> --- a/arch/x86/events/zhaoxin/core.c
> +++ b/arch/x86/events/zhaoxin/core.c
> @@ -541,7 +541,8 @@ __init int zhaoxin_pmu_init(void)
>
> switch (boot_cpu_data.x86) {
> case 0x06:
> - if (boot_cpu_data.x86_model == 0x0f || boot_cpu_data.x86_model == 0x19) {
> + if ((boot_cpu_data.x86_model == 0x0f && boot_cpu_data.x86_stepping >= 0x0e) ||
> + boot_cpu_data.x86_model == 0x19) {
>
> x86_pmu.max_period = x86_pmu.cntval_mask >> 1;

Last time we talked:

https://lore.kernel.org/r/3c7da7fd-402f-c74f-c96c-0e88828eab58@xxxxxxxxxxx

you said that Nano #GPs when trying to RDPMC the fixed counters. Which
sounds like an erratum.

We handle those by adding an X86_BUG flag, setting that flag for Nano and
then testing it where needed. Grep the source tree for examples.
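
To illustrate the pattern, here is a rough userspace sketch, not kernel
code. In the tree the flag would be an X86_BUG() definition, set with
set_cpu_bug() and tested with boot_cpu_has_bug(); everything named below
(struct cpu_info, BUG_RDPMC_FIXED_GP, vendor_init(), pmu_init()) is a
hypothetical stand-in, and how to reliably identify Nano is deliberately
left to the caller because that is still an open question.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for the kernel's per-CPU info. */
struct cpu_info {
	uint8_t  family;
	uint8_t  model;
	uint8_t  stepping;
	uint32_t bugs;		/* one bit per known erratum */
};

/* Hypothetical bug bit: RDPMC of a fixed counter raises #GP. */
#define BUG_RDPMC_FIXED_GP	(1u << 0)

void set_bug(struct cpu_info *c, uint32_t bug)
{
	c->bugs |= bug;
}

bool has_bug(const struct cpu_info *c, uint32_t bug)
{
	return (c->bugs & bug) != 0;
}

/*
 * Vendor setup: the erratum is recorded once, in one place.
 * 'is_nano' is an assumption standing in for whatever detection
 * method turns out to be reliable.
 */
void vendor_init(struct cpu_info *c, bool is_nano)
{
	if (is_nano)
		set_bug(c, BUG_RDPMC_FIXED_GP);
}

/*
 * PMU driver init: tests the bug flag instead of open-coding steppings.
 * Returns 0 on success, -1 (stand-in for -ENODEV) otherwise.
 */
int pmu_init(const struct cpu_info *c)
{
	if (c->family != 0x06)
		return -1;

	if (c->model != 0x0f && c->model != 0x19)
		return -1;

	if (has_bug(c, BUG_RDPMC_FIXED_GP))
		return -1;

	return 0;
}
```

The point of the split is that the driver only asks "does this CPU have
the erratum?", while the knowledge of which parts are affected lives in
the vendor setup code.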

Please do that above too instead of testing steppings.

Also, I'd like to know why steppings < 0xe mean Nano and why there isn't
a more reliable way to detect it.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette