Re: [PATCH] x86/mce: Schedule mce_setup() on correct CPU for CPER decoding

From: Borislav Petkov
Date: Thu Jun 15 2023 - 11:20:55 EST


On Mon, Apr 17, 2023 at 04:20:06PM +0000, Yazen Ghannam wrote:
> @@ -97,20 +102,13 @@ int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
>  	if (ctx_info->reg_arr_size < 48)
>  		return -EINVAL;
>
> -	mce_setup(&m);
> -
> -	m.extcpu = -1;
> -	m.socketid = -1;
> -
> -	for_each_possible_cpu(cpu) {
> -		if (cpu_data(cpu).initial_apicid == lapic_id) {
> -			m.extcpu = cpu;
> -			m.socketid = cpu_data(m.extcpu).phys_proc_id;
> +	for_each_possible_cpu(cpu)
> +		if (cpu_data(cpu).initial_apicid == lapic_id)
>  			break;
> -		}
> -	}
>
> -	m.apicid = lapic_id;
> +	if (smp_call_function_single(cpu, __mce_setup, &m, 1))

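I can see the following call chain from NMI context, which is a no-no
because smp_call_function_single() must not be called with interrupts
disabled or from interrupt context: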

ghes_notify_nmi
|-> ghes_in_nmi_spool_from_list
  |-> ghes_in_nmi_queue_one_entry
    |-> __ghes_panic
      |-> __ghes_print_estatus
        |-> cper_estatus_print
          |-> cper_estatus_print_section
            |-> cper_print_proc_ia
              |-> arch_apei_report_x86_error
                |-> apei_smca_report_x86_error
                  |-> smp_call_function_single
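
For illustration only: one way to stay safe in that context is to fill
the record from data that is already cached per CPU, much like the loop
the patch removes, instead of running a function on the target CPU. Below
is a minimal userspace sketch of that idea. Every name in it (fake_cpuinfo,
fake_mce, fake_cpu_data, fake_mce_setup_for_apicid) is a made-up stand-in
and not a kernel API, and the APIC IDs in the table are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-CPU info the kernel caches at boot. */
struct fake_cpuinfo {
	uint32_t initial_apicid;
	int phys_proc_id;
};

/* Hypothetical stand-in for the few struct mce fields touched here. */
struct fake_mce {
	int extcpu;
	int socketid;
	uint32_t apicid;
};

#define FAKE_NR_CPUS 4

/* Invented topology: two sockets with two CPUs each. */
static const struct fake_cpuinfo fake_cpu_data[FAKE_NR_CPUS] = {
	{ .initial_apicid = 0x00, .phys_proc_id = 0 },
	{ .initial_apicid = 0x01, .phys_proc_id = 0 },
	{ .initial_apicid = 0x20, .phys_proc_id = 1 },
	{ .initial_apicid = 0x21, .phys_proc_id = 1 },
};

/*
 * Fill @m using only the cached table above: no cross-CPU call and no
 * waiting on another CPU, so nothing here can block.
 *
 * Returns 0 if a CPU with @lapic_id was found, -1 otherwise. On failure
 * extcpu and socketid stay at -1 so the caller can tell.
 */
static int fake_mce_setup_for_apicid(struct fake_mce *m, uint32_t lapic_id)
{
	int cpu;

	assert(m != NULL);

	memset(m, 0, sizeof(*m));
	m->extcpu = -1;
	m->socketid = -1;
	m->apicid = lapic_id;

	for (cpu = 0; cpu < FAKE_NR_CPUS; cpu++) {
		if (fake_cpu_data[cpu].initial_apicid == lapic_id) {
			m->extcpu = cpu;
			m->socketid = fake_cpu_data[cpu].phys_proc_id;
			return 0;
		}
	}

	return -1;
}
```

The sketch also returns an error when no CPU matches, so the caller never
acts on an out-of-range CPU number.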


--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette